This recipe produces a validated problem hypothesis backed by 20-40 structured customer discovery interviews, a synthesis report with pattern analysis, and a go/no-go scorecard based on a clear threshold: at least 8 out of 10 interviewees must describe the target problem unprompted. The output replaces founder intuition with evidence-based problem validation that feeds directly into solution design and MVP scoping. [src1]
Requires an ICP definition (from business/customer-research) with firmographic and behavioral criteria specific enough to screen recruits.

Which path?
├── User wants free tools AND simple workflow
│ └── PATH A: Free Lean — Google Calendar + Google Meet + Google Sheets
├── User wants free tools AND better synthesis
│ └── PATH B: Free Pro — Cal.com + Zoom (free) + Notion
├── User wants paid tools AND automated workflow
│ └── PATH C: Paid Standard — Calendly + Zoom Pro + Dovetail
└── User wants maximum efficiency AND scale beyond 40 interviews
└── PATH D: Paid Scale — Calendly + Zoom + Dovetail + User Interviews
| Path | Tools | Cost | Speed (30 interviews) | Synthesis Quality |
|---|---|---|---|---|
| A: Free Lean | Google Calendar + Meet + Sheets | $0 | 4-6 weeks | Manual — adequate |
| B: Free Pro | Cal.com + Zoom free + Notion | $0 | 4-5 weeks | Structured — good |
| C: Paid Standard | Calendly + Zoom Pro + Dovetail | $45-75/mo | 3-4 weeks | AI-assisted — excellent |
| D: Paid Scale | Calendly + Zoom + Dovetail + User Interviews | $150-300/mo | 2-3 weeks | Automated — excellent |
Duration: 5-10 days · Tool: Scheduling tool + outreach channels
Recruit 35-50 candidates to yield 25-40 completed interviews (expect 30-40% no-show/decline rate). Recruit only from your ICP. [src1]
| Channel | Signal Quality | Response Rate |
|---|---|---|
| 1. Warm intro via mutual contact | Highest | 40-60% |
| 2. LinkedIn personalized outreach | High | 15-25% |
| 3. Industry community/Slack/Reddit | High | 10-20% |
| 4. Cold email to ICP list | Medium | 5-15% |
| 5. Social media post (Twitter/X) | Medium | 3-10% |
| 6. Paid recruiting panel | Variable | 80-95% |
Verify: 35+ candidates scheduled within 10 days · If failed: Add a second recruitment channel or offer $20-50 gift card incentive
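The funnel sizing above (35-50 recruits to yield 25-40 completed interviews at a 30-40% drop-off) can be sketched directly. The function name and interface are illustrative, not part of the recipe:

```python
# Funnel sizing sketch: candidates to recruit for a target number of
# completed interviews, given an expected no-show/decline rate.
import math

def recruits_needed(target_completed: int, dropout_rate: float) -> int:
    """Candidates to schedule so that, after dropouts, the target is met."""
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be in [0, 1)")
    return math.ceil(target_completed / (1 - dropout_rate))

# At the 30-40% drop-off range the step above assumes:
print(recruits_needed(30, 0.30))  # 43
print(recruits_needed(30, 0.40))  # 50
```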
Duration: 1-2 hours · Tool: Document editor
Build a semi-structured script following Mom Test principles: ask about their life, not your idea. Focus on past behavior, not future intentions. [src1]
OPENING (2 min):
"Thanks for making time. I'm researching [problem domain] to understand
the real challenges. No right or wrong answers. Mind if I record?"
CONTEXT (3 min):
1. "Tell me about your role. What does a typical week look like?"
2. "Where does [problem domain] fit into your priorities?"
PROBLEM EXPLORATION (15 min):
3. "Walk me through the last time you dealt with [broad topic area]."
4. "What's the hardest part about [the process they described]?"
5. "How are you solving that today?"
6. "What have you tried that didn't work?"
7. "If you could wave a magic wand, what would change?"
COMMITMENT TESTING (3 min):
8. "Who else on your team deals with this?"
9. "Would you be open to a follow-up conversation?"
CLOSE (2 min):
"Is there anything I should have asked but didn't?"
Verify: Script reviewed by at least one other person. One pilot interview completed. · If failed: Cut questions if pilot runs over 30 minutes
Duration: 2-4 weeks · Tool: Video conferencing + transcription · Rate limit: Max 3 interviews per day
Run interviews following the script. Adapt based on emerging themes. Log every interview immediately after completion.
Per-interview checklist:
□ Recording consent given
□ Recording started (video + audio)
□ Script available but NOT read verbatim
□ Timer visible — stay within 25-30 minutes
□ Exact quotes captured (not paraphrases)
□ Commitment ask completed at end
□ 5-minute debrief immediately after
Verify: At least 80% of scheduled interviews completed. Tracking log fully populated. · If failed: If no-show rate exceeds 40%, send 24h and 1h reminders
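A minimal sketch of the "log every interview immediately" rule: append one row to the tracking log right after the 5-minute debrief. The column names here are my own illustration, not a prescribed schema:

```python
# Per-interview logging sketch. Adjust FIELDS to your own tracking log;
# these column names are assumptions, not mandated by the recipe.
import csv
import datetime
import pathlib

LOG = pathlib.Path("interview-tracking-log.csv")
FIELDS = ["date", "participant", "unprompted_mention", "severity_1to5",
          "current_solution", "commitment", "key_quote"]

def log_interview(row: dict) -> None:
    """Append one interview record, writing the header on first use."""
    new_file = not LOG.exists()
    with LOG.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if new_file:
            writer.writeheader()
        writer.writerow(row)

log_interview({
    "date": datetime.date.today().isoformat(),
    "participant": "P-014",          # hypothetical participant ID
    "unprompted_mention": True,
    "severity_1to5": 4,
    "current_solution": "spreadsheet + email reminders",
    "commitment": "intro to teammate",
    "key_quote": "I lose half a day every week to this.",
})
```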
Duration: 4-8 hours (batch after every 8-10 interviews) · Tool: Dovetail, Notion, or Google Sheets
Tag and code transcripts, cluster into themes, build evidence table. Synthesize in batches of 8-10, not all at the end. [src2] [src7]
Per-batch synthesis:
1. Tag: problem statements, severity, frequency, current solutions, outcomes
2. Cluster: group by theme, name each cluster
3. Count: frequency per theme, average severity, prompted vs unprompted
4. Build evidence table: Problem Theme | Frequency | Severity | Quotes
Verify: Every transcript tagged. Theme clusters are mutually exclusive. · If failed: If themes overlap, split into sub-themes and re-tag
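The per-batch count step (frequency per theme, average severity, prompted vs unprompted) can be sketched as a small aggregation. The highlight fields (`theme`, `severity`, `unprompted`) are assumed names, not a fixed schema:

```python
# Batch-synthesis sketch: build the evidence table from tagged highlights.
from collections import defaultdict

tagged = [  # one dict per tagged highlight; sample data for illustration
    {"theme": "manual reporting", "severity": 4, "unprompted": True},
    {"theme": "manual reporting", "severity": 5, "unprompted": True},
    {"theme": "tool sprawl",      "severity": 3, "unprompted": False},
]

def evidence_table(highlights):
    """Group highlights by theme and compute the per-theme metrics."""
    themes = defaultdict(list)
    for h in highlights:
        themes[h["theme"]].append(h)
    rows = [{
        "theme": theme,
        "frequency": len(items),
        "avg_severity": round(sum(i["severity"] for i in items) / len(items), 1),
        "unprompted": sum(i["unprompted"] for i in items),
    } for theme, items in themes.items()]
    # Most frequent themes first, ties broken by severity
    return sorted(rows, key=lambda r: (-r["frequency"], -r["avg_severity"]))

for row in evidence_table(tagged):
    print(row)
```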
Duration: 1-2 hours · Tool: Spreadsheet or synthesis tool
Apply the validation scorecard with four metrics: unprompted mention rate (≥ 80%), severity (≥ 3.0/5), active solution seeking (≥ 30%), and commitment signal (≥ 25%).
DECISION MATRIX (evaluate top to bottom; first matching rule applies):
All 4 metrics at validated/strong level → GO
Metric 1 validated + 2 of 3 others strong → CONDITIONAL GO
Metric 1 not validated → NO GO (pivot or reframe)
Any single metric below minimum → INVESTIGATE further
Verify: Scorecard uses tallied data from tracking log, not impressions · If failed: If metric 1 is 60-79%, conduct 10 more interviews with reframed problem language
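The decision matrix can be expressed as a rule list evaluated in order. Mapping "validated" to the four thresholds above (80% / 3.0 / 30% / 25%) and collapsing the "strong" tier into it is an assumption of this sketch, not a canonical spec:

```python
# Go/no-go sketch for the validation scorecard. Metric names mirror the
# JSON output spec; thresholds are the validated levels from this step.
VALIDATED = {
    "unprompted_mention_rate": 0.80,
    "avg_severity": 3.0,
    "active_solution_rate": 0.30,
    "commitment_rate": 0.25,
}

def decide(scores: dict) -> str:
    """Apply the matrix rules top to bottom; first match wins."""
    ok = {m: scores[m] >= t for m, t in VALIDATED.items()}
    if all(ok.values()):
        return "GO"
    if not ok["unprompted_mention_rate"]:
        return "NO GO"                       # metric 1 failed: pivot or reframe
    others = [m for m in ok if m != "unprompted_mention_rate"]
    if sum(ok[m] for m in others) >= 2:
        return "CONDITIONAL GO"
    return "INVESTIGATE"

print(decide({"unprompted_mention_rate": 0.85, "avg_severity": 3.6,
              "active_solution_rate": 0.50, "commitment_rate": 0.30}))  # GO
```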
Duration: 2-4 hours
Generate the synthesis report, problem validation scorecard, raw data archive, and design partner list for downstream consumption.
Deliverables:
1. interview-synthesis-report.md — evidence-backed findings
2. problem-validation-scorecard.json — machine-readable scorecard
3. interview-tracking-log.csv — all interviews with coded data
4. design-partners.csv — qualified follow-up candidates
{
"output_type": "problem_validation_scorecard",
"format": "JSON",
"columns": [
{"name": "problem_hypothesis", "type": "string", "required": true},
{"name": "total_interviews", "type": "number", "required": true},
{"name": "unprompted_mention_rate", "type": "number", "required": true},
{"name": "avg_severity", "type": "number", "required": true},
{"name": "active_solution_rate", "type": "number", "required": true},
{"name": "commitment_rate", "type": "number", "required": true},
{"name": "decision", "type": "string", "required": true},
{"name": "top_problems", "type": "array", "required": true},
{"name": "design_partners", "type": "number", "required": false}
],
"expected_row_count": "1",
"deduplication_key": "problem_hypothesis"
}
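A sketch of emitting `problem-validation-scorecard.json` and checking it against the required fields of the spec above. All values shown are invented placeholders:

```python
# Scorecard emission sketch: required fields taken from the output spec.
import json

REQUIRED = ["problem_hypothesis", "total_interviews", "unprompted_mention_rate",
            "avg_severity", "active_solution_rate", "commitment_rate",
            "decision", "top_problems"]

scorecard = {  # placeholder values, for illustration only
    "problem_hypothesis": "Ops leads lose hours to manual reporting",
    "total_interviews": 30,
    "unprompted_mention_rate": 0.83,
    "avg_severity": 3.8,
    "active_solution_rate": 0.53,
    "commitment_rate": 0.30,
    "decision": "GO",
    "top_problems": ["manual reporting", "tool sprawl"],
    "design_partners": 7,  # optional field
}

missing = [f for f in REQUIRED if f not in scorecard]
assert not missing, f"scorecard missing required fields: {missing}"

with open("problem-validation-scorecard.json", "w") as f:
    json.dump(scorecard, f, indent=2)
```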
| Quality Metric | Minimum Acceptable | Good | Excellent |
|---|---|---|---|
| Interviews completed | 20 | 30 | 40+ |
| Unprompted problem mention rate | ≥ 60% | ≥ 80% | ≥ 90% |
| Average severity score | ≥ 3.0/5.0 | ≥ 3.5/5.0 | ≥ 4.0/5.0 |
| Active solution seeking rate | ≥ 30% | ≥ 50% | ≥ 70% |
| Transcript tagging completeness | 80% | 95% | 100% |
| Synthesis lag | < 72 hours | < 48 hours | < 24 hours |
If below minimum: If fewer than 20 interviews completed, extend recruitment. If unprompted rate is below 60% after 20 interviews, reframe problem hypothesis and conduct 10 more.
| Error | Likely Cause | Recovery Action |
|---|---|---|
| High no-show rate (> 40%) | Weak outreach or wrong channel | Rewrite outreach emphasizing value exchange, add calendar reminders at 24h and 1h |
| Only compliments, no real problems | Questions too leading or solution mentioned | Reset script to pure Mom Test questions, remove solution hints [src1] |
| All interviewees describe different problems | ICP too broad or hypothesis too vague | Narrow ICP segment, pick one job-to-be-done, re-recruit [src4] |
| Transcripts missing or corrupt | Recording failed or consent not given | Check recording at minute 1. Use backup note-taker. Re-interview if critical |
| Synthesis themes overlapping | Tag taxonomy too broad or inconsistent | Rebuild taxonomy with mutual exclusivity, re-tag from highlights |
| Interviewer talks > 30% | Script read verbatim or pitching | Practice active listening, review own talk ratio after each interview [src1] |
| Component | Free Tier | Paid Tier | At Scale |
|---|---|---|---|
| Scheduling (Calendly/Cal.com) | $0 (1 event type) | $10/mo | $16/mo |
| Video conferencing (Zoom/Meet) | $0 (40-min limit) | $13/mo | $20/mo |
| Transcription (Otter.ai/built-in) | $0 (300 min/mo) | $10/mo | $24/mo |
| Synthesis (Sheets/Dovetail) | $0 (Google Sheets) | $29/mo | $49/mo |
| Participant incentives | $0 (value exchange) | $20-50/interview | $50-100/interview |
| Total for 30 interviews | $0 | $62-112/mo + incentives | $109-159/mo + incentives |
Hypothetical questions produce hypothetical answers. People are terrible at predicting their own future behavior, yielding compliments and false positives that lead to products nobody uses. [src1]
Ask "Tell me about the last time you dealt with [problem domain]" and listen for the problem to emerge naturally. Past behavior is the best predictor of future behavior.
Early pattern recognition is often confirmation bias. The first 10 interviews tend to confirm your hypothesis because you unconsciously seek confirmation. [src6]
The saturation rule: at least 20 interviews, continue until 3 in a row surface zero new problem themes. This typically happens between interview 25 and 35.
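The saturation rule can be checked mechanically. Representing each interview as a set of problem themes (in interview order) is an assumed encoding, not prescribed by the recipe:

```python
# Saturation-rule sketch: stop once >= 20 interviews are done AND the last
# 3 interviews in a row surfaced zero new problem themes.
def reached_saturation(theme_sets, min_interviews=20, streak_needed=3):
    """theme_sets: one set of problem themes per interview, in order."""
    seen, streak = set(), 0
    for i, themes in enumerate(theme_sets, start=1):
        streak = streak + 1 if themes <= seen else 0  # no new theme extends the streak
        seen |= themes
        if i >= min_interviews and streak >= streak_needed:
            return True
    return False
```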
Batch synthesis at the end produces shallow analysis and loses contextual memory of tone, hesitation, and emphasis. [src2]
Process each batch while conversations are fresh. Update theme clusters and tracking log. Adjust questions for the next batch based on emerging patterns.
Use this recipe when a founder or product team needs to validate that a real, severe, frequent problem exists before investing in building a solution. It produces structured evidence that replaces gut feeling with data. The output feeds directly into solution interview design, MVP scoping, and investor pitch evidence. Requires an ICP definition as input.