PMF Signals vs Noise Scorecard

Type: Assessment | Confidence: 0.88 | Sources: 8 | Verified: 2026-04-15

Purpose

This scorecard separates real product-market fit signals from noise by scoring a startup across three independent dimensions — retention (do users come back?), engagement (is usage deepening?), and willingness-to-pay (do they pay without discounts?) — each further broken into weighted sub-signals with calibrated benchmarks. Use it when a founder or investor needs to answer "Is PMF real here, or are we fooling ourselves with vanity metrics?" The output is a composite 0-100 score with per-dimension diagnostics that route directly to the correct next card. [src1, src2]

PMF is not binary and not permanent — it is a progressive set of tests, each more definitive than the last. [src2] This scorecard exposes false positives (top-line growth from paid acquisition, inflated NPS without retention, free users with no pricing validation) and false negatives (slow-burn B2B products with strong retention but low signup volume). [src3, src6]

Constraints

Assessment Dimensions

Dimension 1: Retention (weight: 45%)

What this measures: Whether users who arrive stick around long enough to matter. Retention is the single most trustworthy PMF signal because it is behavioral, cohort-isolated, and harder to fake than top-line metrics. [src2, src8]

Sub-signal 1A: Retention Curve Flatness

| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | Retention decays toward zero; no plateau by Month 6 | Cohort curve keeps falling; <5% retained at M6 |
| 2 | Emerging | Decay slows but no clear plateau yet | Slope flattening; 10-20% retained at M6 |
| 3 | Defined | Clear plateau emerging in best segments | Plateau visible at 20-30% for at least one segment |
| 4 | Managed | Flat plateau across primary segments | 30-40% plateau sustained 6+ months |
| 5 | Optimized | Smiling retention (curve turns up over time) | Plateau rises — expansion and re-engagement dominate churn [src8] |

Red flags: Aggregate retention looks healthy but cohort view shows each new cohort performs worse than the last — classic sign of new-user acquisition masking deteriorating fit. [src3]

Quick diagnostic question: "Plot retention by weekly or monthly cohort for the last 6 months — does the curve flatten for any segment?"
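The plateau check above can be sketched programmatically. This is a minimal illustration, not part of the scorecard: the function name `retention_plateau` and the one-point-per-month flatness tolerance are assumptions to tune against your own cohort data.

```python
def retention_plateau(curve, tol=1.0):
    """Return the first month at which a cohort retention curve flattens,
    or None if it is still decaying.

    curve: retention percentages by month, index 0 = month 0 (always 100).
    tol:   max month-over-month drop (percentage points) to count as flat.
    """
    for month in range(1, len(curve)):
        drop = curve[month - 1] - curve[month]
        if drop <= tol:
            return month  # curve has flattened here
    return None  # still decaying; no plateau yet

# A curve settling near 27% ("Defined" territory on the rubric above):
flattening = [100, 45, 32, 28, 27.5, 27]
# A curve that keeps falling toward zero ("Ad hoc"):
decaying = [100, 40, 20, 10, 5, 2]
```

Run the check per segment, not on the aggregate: a plateau in one segment with decay everywhere else is exactly the segment-concentration signal scored in sub-signal 1C.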

Sub-signal 1B: Churn Rate (monthly, logo)

| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | B2C >10%/mo, B2B SMB >5%/mo, B2B ENT >2%/mo | Catastrophic churn |
| 2 | Emerging | B2C 7-10%/mo, B2B SMB 3-5%/mo, B2B ENT 1-2%/mo | Problematic |
| 3 | Defined | B2C 5-7%/mo, B2B SMB 2-3%/mo, B2B ENT 0.5-1%/mo | Industry median |
| 4 | Managed | B2C 3-5%/mo, B2B SMB 1-2%/mo, B2B ENT <0.5%/mo | Strong |
| 5 | Optimized | B2C <3%/mo, B2B SMB <1%/mo, B2B ENT near-zero | Best-in-class [src6] |

Red flags: Revenue churn lower than logo churn = small customers leaving while big ones stay (common, acceptable). Revenue churn higher than logo churn = bigger customers leaving = severe PMF problem. [src6]

Quick diagnostic question: "What is your monthly logo churn and monthly revenue churn, measured as a cohort for the last 3 months?"
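The logo-vs-revenue comparison in the red flag above can be made concrete with a small sketch; the helper `monthly_churn` and the sample figures are illustrative assumptions, not benchmarks.

```python
def monthly_churn(customers_start, customers_lost, mrr_start, mrr_lost):
    """Monthly logo churn and revenue churn for one cohort.

    Logo churn    = customers lost / customers at start of month.
    Revenue churn = MRR lost to those customers / MRR at start of month.
    """
    logo = customers_lost / customers_start
    revenue = mrr_lost / mrr_start
    return logo, revenue

# 200 customers, 8 churned; $50,000 MRR, $1,500 of it churned:
logo, revenue = monthly_churn(200, 8, 50_000, 1_500)
# Logo churn 4%, revenue churn 3%: small customers leaving, which the
# red-flag rule above treats as acceptable. The reverse ordering is severe.
flag = "severe" if revenue > logo else "acceptable"
```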

Sub-signal 1C: Segment Concentration

| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | No segment shows differentiated retention | All cohorts decay at similar poor rates |
| 2 | Emerging | One segment retains slightly better | Top segment 1.5x better than average |
| 3 | Defined | Clear "best segment" identified with materially better curve | Top segment 2-3x better |
| 4 | Managed | Best segment retains at plateau; others don't | Clear high-expectation customer profile documented [src1] |
| 5 | Optimized | Best segment retains, dominates revenue, and is expanding | Top segment >50% of revenue, growing share |

Red flags: Scoring retention across all users instead of segmenting hides the real fit that exists for a subset. As Balfour argues, PMF rarely spans the whole market; it lives in segments. [src2]

Quick diagnostic question: "Can you point to one customer segment (by persona, use case, or size) that retains at >2x the rate of everyone else?"

Dimension 2: Engagement (weight: 30%)

What this measures: Whether active users develop habit and depth of use, not just logins. Engagement metrics separate shallow usage (a retention risk) from deep workflow integration (a PMF signal). [src6]

Sub-signal 2A: Core Action Frequency

| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | No defined "core action", or most users never perform it | <20% of signups perform core action |
| 2 | Emerging | Core action defined; minority perform it | 20-40% perform core action at least once |
| 3 | Defined | Majority perform core action in first session | 40-60% perform it; some repeat |
| 4 | Managed | Habitual use — core action performed weekly | 60-80% return within 7 days to repeat [src7] |
| 5 | Optimized | Core action performed multiple times per week by most actives | >80% weekly active; DAU/MAU >40% for consumer |

Red flags: Users log in but never perform the action that creates value — common in signup-optimized funnels. [src6]

Quick diagnostic question: "Define the single action that delivers the core value. What % of signups perform it? How often do retained users repeat it?"

Sub-signal 2B: Depth of Adoption

| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | Users touch only 1 feature | Shallow; high churn risk |
| 2 | Emerging | Most users touch 2 features | Limited depth |
| 3 | Defined | Retained users touch 3+ features | Multi-feature adoption by actives |
| 4 | Managed | Retained users create persistent artifacts (projects, integrations, data) | Lock-in via artifacts [src6] |
| 5 | Optimized | Users integrate product into daily workflow; switching cost is high | Product embedded in user routine |

Red flags: Signup spike from launch/press without feature depth = news-cycle noise, not PMF. [src3]

Quick diagnostic question: "Of users who are still active after 30 days, how many unique features/actions have they used? Do they create persistent artifacts (projects, integrations, teams)?"

Sub-signal 2C: Organic / Word-of-Mouth Share

| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | <10% of new users arrive organically | Growth entirely paid-dependent |
| 2 | Emerging | 10-20% organic | Some spontaneous interest |
| 3 | Defined | 20-35% organic | Meaningful word-of-mouth emerging |
| 4 | Managed | 35-50% organic | Strong referral loop; viral coefficient approaching 1 |
| 5 | Optimized | >50% organic; users recruit other users without being asked | Network effects or genuine love [src4] |

Red flags: Paid acquisition can manufacture any growth curve — if organic share is <10% and retention is mediocre, you are buying a graph, not PMF. [src3]

Quick diagnostic question: "What % of new signups this month came from unpaid channels (direct, organic search, referral, word-of-mouth)?"

Dimension 3: Willingness-to-Pay (weight: 25%)

What this measures: Whether users commit economically — paying full price without discount, sustaining payments without churn, and expanding spend. Stated intent ("I would pay for this") is noise; actual payment is signal. [src2]

Sub-signal 3A: Paid Conversion (where applicable)

| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | Free-only, or trial → paid conversion <1% | No willingness signal |
| 2 | Emerging | Trial → paid 1-3% (B2C) or 5-10% (B2B freemium) | Weak signal |
| 3 | Defined | Trial → paid 3-5% (B2C) or 10-20% (B2B freemium) | Industry-typical |
| 4 | Managed | Trial → paid 5-10% (B2C) or 20-35% (B2B freemium) | Strong |
| 5 | Optimized | Trial → paid >10% (B2C) or >35% (B2B freemium) | Best-in-class [src6] |

Red flags: Discounts, coupons, or "free forever" converting users — that is a price signal, not a fit signal. Strip discounts and re-measure.

Quick diagnostic question: "What % of users who start a free trial or free plan convert to a paid plan at full list price (no discount)?"

Sub-signal 3B: Price Stability / Discount Dependency

| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | >50% of revenue from discounted deals | Sellers buy revenue; value unproven |
| 2 | Emerging | 30-50% discounted | Heavy discounting to close |
| 3 | Defined | 15-30% discounted | Normal negotiation |
| 4 | Managed | <15% discounted; full-price close rate growing | Customers accept value |
| 5 | Optimized | No discount needed; attempted price increases do not raise churn | Pricing power |

Red flags: Sales team resorts to discount/extended terms to close; churn spikes when discounts expire. [src6]

Quick diagnostic question: "What % of closed deals in the last 90 days involved a discount greater than 10% off list?"

Sub-signal 3C: Net Revenue Retention (NRR) / Expansion

| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | NRR <80% | Business shrinking on existing base |
| 2 | Emerging | NRR 80-95% | Leakage; no expansion |
| 3 | Defined | NRR 95-105% | Break-even on expansion vs churn |
| 4 | Managed | NRR 105-120% | Healthy expansion; existing base drives growth |
| 5 | Optimized | NRR >120% | Best-in-class; customers pay more over time [src6] |

Red flags: NRR calculated with new logos included — that is GRR plus new sales, not NRR. NRR must be same-cohort only.

Quick diagnostic question: "For the cohort of customers from 12 months ago, what is their revenue today as a percentage of their revenue 12 months ago (excluding any new customers acquired since)?"
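The same-cohort NRR calculation in the diagnostic above can be sketched in a couple of lines; the helper name and the sample figures are illustrative assumptions.

```python
def net_revenue_retention(cohort_mrr_then, cohort_mrr_now):
    """NRR for a single cohort: revenue today from customers who were
    paying 12 months ago, divided by their revenue 12 months ago.
    Customers acquired since are excluded by construction, which avoids
    the GRR-plus-new-sales red flag noted above.
    """
    return cohort_mrr_now / cohort_mrr_then * 100

# Cohort was worth $40,000/mo a year ago; the same accounts now pay $46,000/mo:
nrr = net_revenue_retention(40_000, 46_000)  # ~115, score 4 ("Managed")
```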

Scoring & Interpretation

Overall Score Calculation

Each sub-signal scores 1-5. Compute dimension scores as simple averages of their 3 sub-signals, then apply dimension weights:

Retention Dimension      = avg(1A, 1B, 1C)       weight 0.45
Engagement Dimension     = avg(2A, 2B, 2C)       weight 0.30
Willingness-to-Pay Dim.  = avg(3A, 3B, 3C)       weight 0.25

Composite Score (1-5)    = 0.45*Retention + 0.30*Engagement + 0.25*WTP
Composite (0-100 scale)  = (Composite - 1) / 4 * 100

Retention is weighted highest because it is the hardest signal to fake and the most predictive of long-term outcomes. [src2, src8] Willingness-to-pay is weighted lower only because it may not apply to pre-monetization products — when applicable, treat low WTP scores as a hard gate regardless of composite.
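The weighted calculation above reduces to a few lines of code. This is a minimal sketch of the published formula; the `composite_score` helper and the example sub-signal scores are assumptions for demonstration only.

```python
WEIGHTS = {"retention": 0.45, "engagement": 0.30, "wtp": 0.25}

def composite_score(sub_scores):
    """sub_scores maps each dimension to its three 1-5 sub-signal scores.
    Returns (composite on the 1-5 scale, composite on the 0-100 scale)."""
    dims = {d: sum(s) / len(s) for d, s in sub_scores.items()}
    composite = sum(WEIGHTS[d] * dims[d] for d in WEIGHTS)
    return composite, (composite - 1) / 4 * 100

score_1_5, score_0_100 = composite_score({
    "retention":  [4, 3, 4],  # 1A, 1B, 1C
    "engagement": [3, 3, 2],  # 2A, 2B, 2C
    "wtp":        [3, 2, 3],  # 3A, 3B, 3C
})
# Roughly 3.1 on the 1-5 scale, ~53 on the 0-100 scale
```

Remember the hard gate noted above: if willingness-to-pay applies and scores low, do not let a strong retention average rescue the composite.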

Score Interpretation

| Overall Score (0-100) | Maturity Level | Interpretation | Recommended Next Step |
|---|---|---|---|
| 0-19 | Critical — signals are noise | Claimed PMF is not supported by data. Vanity metrics dominate. Do not scale. | Return to customer discovery; fetch MVP testing framework |
| 20-39 | Developing — weak signals | Early positive signals but heavy noise. Retention not yet proven. High risk of false positive. | Run full PMF engine — fetch PMF measurement |
| 40-59 | Competent — partial PMF | PMF exists in a segment but not broadly. Identify and double down on the high-expectation customer. | Formalize segment focus — fetch PMF measurement |
| 60-79 | Advanced — clear PMF | Multiple independent signals confirm PMF. Can begin scaling in strongest segment. | Full scaling gate check — fetch scaling readiness |
| 80-100 | Best-in-class — undeniable PMF | Andreessen's "you'll feel it" territory: retention plateau + deep engagement + unforced payment. | Build scaling engine — fetch growth model design [src4] |
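The interpretation bands reduce to a simple threshold lookup; a minimal sketch (the `maturity_level` helper is an illustrative name, and the labels are quoted from the table above):

```python
BANDS = [
    (80, "Best-in-class — undeniable PMF"),
    (60, "Advanced — clear PMF"),
    (40, "Competent — partial PMF"),
    (20, "Developing — weak signals"),
    (0,  "Critical — signals are noise"),
]

def maturity_level(score_0_100):
    """Map a 0-100 composite score to its maturity band label."""
    for floor, label in BANDS:
        if score_0_100 >= floor:
            return label
```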

Dimension-Level Action Routing

| Weak Dimension (Score < 3) | Fetch This Card |
|---|---|
| Retention | Product-Market Fit Measurement — focus on cohort analysis and segment isolation |
| Engagement | Product-Market Fit Measurement — apply Superhuman PMF engine to high-expectation customer [src1] |
| Willingness-to-Pay | Value-Based Pricing SaaS — re-validate pricing with real behavior, not surveys |

Benchmarks by Segment

Scores mean different things at different stages and models. Applying one threshold across segments produces misleading diagnoses. [src7]

| Segment | Expected Average Score (0-100) | "Good" Threshold | "Alarm" Threshold |
|---|---|---|---|
| Pre-seed (MVP, <6 months) | 15-30 | >35 | <15 — too early to measure meaningfully |
| Seed (early revenue, 6-18 months) | 30-50 | >50 | <25 |
| Post-seed (growing revenue, 18-30 months) | 45-65 | >60 | <35 |
| Series A+ (scaling, 30+ months) | 55-75 | >70 | <50 — PMF should be solid by this stage [src7] |
| Consumer (B2C subscription) | 35-55 | >55 | <25 — B2C churn floors are higher |
| B2B SaaS SMB | 45-65 | >60 | <35 |
| B2B SaaS Enterprise | 55-75 | >65 | <45 — enterprise retention floors are much higher |
| Marketplace | 30-50 (two-sided is harder) | >55 | <25 — need to score both sides |

[src3, src6, src7]

Common Pitfalls in Assessment

When This Matters

Use this scorecard when: (1) a founder or investor disagrees about whether PMF exists; (2) before any decision to increase burn rate, hire a sales team, or raise a Series A; (3) after a launch moment (press, Product Hunt) to distinguish sustained signal from news-cycle noise; (4) quarterly as part of board reporting to track PMF trajectory, not just magnitude. [src4, src7]

Related Units