This recipe produces a fully operational customer retention system — churn root causes identified and ranked from 12 months of data evidence, a composite health scoring model deployed with automated alert triggers, smart dunning sequences recovering 50-70% of failed payments, segment-specific intervention playbooks documented for every root cause x ACV tier combination, and a live retention dashboard tracking GRR, NRR, and intervention conversion rates with weekly review cadence. The output is a self-sustaining retention engine that proactively identifies at-risk customers 60+ days before cancellation and routes them through proven save motions. 60-70% of SaaS churn is preventable with systematic intervention. [src2]
Which path?
├── ACV < $5K AND budget = free
│ └── PATH A: Spreadsheet + Free Analytics — Sheets + Mixpanel Free + Stripe Smart Retries
├── ACV < $5K AND budget up to $15K/yr
│ └── PATH B: Starter CS Platform — Totango/Vitally + Mixpanel + Churnkey
├── ACV $5K-$50K AND budget up to $60K/yr
│ └── PATH C: Mid-Market CS — ChurnZero + Mixpanel Growth + Recurly
└── ACV $50K+ OR budget $60K+/yr
└── PATH D: Enterprise CS — Gainsight + Amplitude Enterprise + custom ML models
| Path | Tools | Cost/yr | Implementation | Best For |
|---|---|---|---|---|
| A: Spreadsheet + Free | Sheets, Mixpanel Free, Stripe | $0-$500 | 2-4 weeks | <200 customers, bootstrapped |
| B: Starter CS | Totango/Vitally, Mixpanel, Churnkey | $6K-$15K | 4-6 weeks | 200-1,000 customers, SMB-focused |
| C: Mid-Market CS | ChurnZero, Mixpanel Growth, Recurly | $20K-$60K | 6-8 weeks | 1,000-5,000 customers, mid-market |
| D: Enterprise CS | Gainsight, Amplitude, custom ML | $60K-$200K | 3-6 months | 5,000+ customers, enterprise ACV |
Duration: 1-2 weeks · Tool: SQL/spreadsheet + billing system export
Export 12 months of customer data. Classify every churn event: voluntary vs involuntary, then into seven root-cause buckets (product gaps, onboarding failure, value realization failure, service issues, competitive displacement, budget change, champion departure). Record ACV, segment, tenure, last-90-day usage, and stated vs behavioral reason. Compute fixability x revenue impact x frequency matrix to prioritize the top 3-5 causes. [src2]
Root-cause buckets (Rework 2026 Guide):
a) Product gaps/bugs e) Competitive displacement
b) Onboarding failure f) Budget/priority change
c) Value realization failure g) Champion departure
d) Service/relationship issues
Priority = fixability (1-5) * revenue_impact (1-5) * frequency (1-5)
Top 3-5 by priority score get intervention playbooks
Verify: Every churn event classified; voluntary/involuntary split calculated; top 3-5 root causes identified with % attribution · If failed: If <50 churn events, extend to 18-24 months; backfill from support tickets and CSM notes
Duration: 1-2 weeks · Tool: Mixpanel (retention report) or SQL + spreadsheet
Build cohort retention curves segmented by signup month, segment, channel, and plan tier. Measure at Day 7, 30, 60, 90, 180, 365. Identify the "magic number" usage threshold. Real examples: Slack daily use in first week = 90% retention vs 30%; Segment 3+ integrations = 95% vs 40%; Zendesk onboarding redesign = 22% early churn reduction. [src1] [src2]
Churn benchmarks (B2B SaaS 2026):
Enterprise monthly: 0.5-1.0% Mid-market: 1-3%
SMB monthly: 3-7% B2C subscription: 6.5-8%
NRR benchmarks (939 companies, Optifai Q1-Q3 2025):
Enterprise (ACV >$100K): median 118%, top quartile >130%
Mid-market ($25K-$100K): median 108%, top quartile >120%
SMB (<$25K): median 97%, top quartile >105%
Verify: Retention curves for 4+ segments; "magic number" identified; worst-performing cohort found · If failed: Merge adjacent months or reduce segment granularity
Duration: 2-3 weeks · Tool: CS platform or spreadsheet
Build composite health score with 5-7 signals weighted by correlation with actual churn. Reverse-engineer from outcomes: compare churned vs retained customers 90-180 days before outcome. Trend features are 3x more predictive than point-in-time. NPS drops of 3+ points QoQ predict 4x baseline churn. Cap any single signal at 25% max. [src4] [src8]
Health Score Template (adjust weights to your data):
Product Usage (40%): login frequency, feature breadth, usage trend (30-day slope)
Support Sentiment (20%): ticket volume vs baseline, CSAT on last 3 tickets
Financial (20%): payment failures (90 days), contract type (annual vs monthly)
Relationship (20%): NPS trend (QoQ), exec sponsor engagement
Thresholds: Green 80-100 | Yellow 50-79 | Red 0-49
Target distribution: ~60% green / 25% yellow / 15% red
Alert Triggers:
- Health drops >20 pts in 14 days → CSM notification
- Logins drop >40% vs 30-day avg → re-engagement email
- NPS drops 3+ pts QoQ → exec outreach within 48 hours
- 5+ support tickets in 30 days → escalation flag
Verify: Scores for all customers; ~60/25/15 distribution; back-test identifies >60% of eventual churners as yellow/red · If failed: Replace lowest-correlation signals; cap dominant signals at 25%; adjust thresholds
Duration: 1-2 weeks · Tool: Stripe Smart Retries / Recurly / Churnkey / Churn Buster
Deploy smart dunning sequences. Involuntary churn is 20-40% of total and the highest-ROI fix. Smart retry recovers 70% of failed payments vs 30% for basic retries. Pre-expiry card update alerts prevent 40%+ of failures. Multi-channel approach (email + in-app + SMS) maximizes recovery. [src5]
Dunning Sequence (multi-channel):
Day 0: Smart retry at algorithm-optimized time
Day 1: Retry #2 + email with 1-click card update link
Day 3: Retry #3 + urgency email
Day 5: Retry #4 + in-app banner + SMS (if opted in)
Day 7: Final retry + "account pauses in 48h" email
Day 9: Account PAUSED (not canceled) + 1-click reactivation
Pre-dunning prevention (deploy first):
30 days before card expiry → email notification
14 days → email + in-app notification
7 days → final reminder
Recovery targets: Minimum 40% | Good 55% | Excellent 68-70%
Verify: Dunning active; pre-expiry alerts sending; recovery rate >50% · If failed: Add SMS + in-app; test different retry timing; consider dedicated dunning tool for 10-20 pp improvement
Duration: 2-4 weeks · Tool: CS platform + CRM
Design interventions per root cause x ACV tier. Each playbook: trigger condition, action steps, owner, timeline, expected save rate. Save hierarchy: solve the problem → offer scope reduction → discount (last resort). Proactive retention converts at 60-80% vs 15-20% reactive. Intervene 60+ days before expected churn. [src2] [src7]
| Root Cause | SMB (<$5K ACV) | Mid-Market ($5K-$50K) | Enterprise ($50K+) |
|---|---|---|---|
| Onboarding failure (<90 days) | Automated drip + in-app guides + tooltips. Save: 25-35% | CSM-led 30-day program + weekly check-ins. Save: 40-55% | Dedicated onboarding mgr + implementation team. Save: 60-75% |
| Value realization failure (3-12 mo) | Feature discovery campaigns + use-case templates. Save: 20-30% | QBR + adoption scorecard + training sessions. Save: 35-50% | Custom training workshop + integration optimization. Save: 50-65% |
| Competitive threat | Comparison content + value recap + switch-cost analysis. Save: 15-25% | CSM competitive response + feature gap bridge. Save: 25-40% | Exec meeting + roadmap preview + custom dev. Save: 40-55% |
| Champion departure | Automated new-user onboarding trigger. Save: 20-30% | CSM re-onboarding + value recap. Save: 35-50% | Exec-to-exec + multi-thread account. Save: 45-60% |
| Budget/downtrade | Annual discount or pause (max 3 mo). Save: 30-40% | Scope reduction vs full cancel. Save: 40-55% | Custom contract restructure + value guarantee. Save: 55-70% |
Verify: 9+ playbooks (3 root causes x 3 tiers minimum); each with trigger, steps, owner, save rate; CS team reviewed · If failed: Prioritize highest-ACV segments; automate SMB interventions entirely
Duration: 1-2 weeks · Tool: CS platform dashboard, BI tool, or Google Sheets
Build live dashboard with weekly review cadence. Track lagging indicators (GRR, NRR, churn rate) and leading indicators (health distribution, intervention velocity). NRR >110% commands 2-3x higher valuations. 10-point NRR increase boosts valuation 20-30%. Companies with 120%+ NRR command 10-12x ARR vs 6-8x at NRR ~100%. [src3] [src6]
Primary Metrics (weekly review):
- GRR: target 90%+ (top: 95%+)
- NRR: Enterprise median 118%, Mid-market 108%, SMB 97%
- Monthly churn rate by segment
- Health score distribution (% green/yellow/red)
Intervention Metrics (weekly):
- At-risk accounts identified | Interventions triggered vs completed
- Save rate by type, root cause, segment
- Time: health decline to first intervention (target <7 days)
- Dunning recovery rate (target >50%)
Leading Indicators (daily):
- Login frequency trend (7-day vs 30-day rolling avg)
- Support ticket volume spike (>2x baseline)
- Payment failure rate | NPS trend
Verify: Dashboard live with daily refresh; first weekly review completed; baseline metrics recorded · If failed: Start with manual spreadsheet; automate incrementally
{
"output_type": "retention_system",
"format": "configured platform + documents",
"columns": [
{"name": "customer_id", "type": "string", "description": "Unique customer identifier", "required": true},
{"name": "health_score", "type": "number", "description": "Composite health score 0-100", "required": true},
{"name": "health_status", "type": "string", "description": "green/yellow/red", "required": true},
{"name": "health_trend", "type": "string", "description": "improving/stable/declining (30-day slope)", "required": true},
{"name": "primary_risk_factor", "type": "string", "description": "Top risk signal from health model", "required": false},
{"name": "assigned_playbook", "type": "string", "description": "Intervention triggered (root cause x segment)", "required": false},
{"name": "intervention_status", "type": "string", "description": "pending/active/completed/saved/churned", "required": false},
{"name": "days_to_renewal", "type": "number", "description": "Days until next renewal", "required": true},
{"name": "acv", "type": "number", "description": "Annual contract value in USD", "required": true},
{"name": "segment", "type": "string", "description": "SMB/mid-market/enterprise", "required": true},
{"name": "tenure_days", "type": "number", "description": "Days since initial signup", "required": true}
],
"expected_row_count": "all active customers",
"sort_order": "health_score ascending (worst first)",
"deduplication_key": "customer_id"
}
| Quality Metric | Minimum Acceptable | Good | Excellent |
|---|---|---|---|
| Health score accuracy (predicted vs actual churn) | >60% correlation | >75% | >85% |
| Involuntary churn recovery rate | >40% | >55% | >68-70% |
| Proactive intervention coverage (at-risk reached) | >50% | >75% | >90% |
| Save rate for at-risk accounts | >30% | >45% | >60% |
| Time: health decline to first intervention | <14 days | <7 days | <3 days |
| GRR improvement vs baseline (90 days) | +2 pp | +5 pp | +10 pp |
| NRR improvement vs baseline (180 days) | +3 pp | +7 pp | +12 pp |
If below minimum: Recalibrate health score weights; if intervention coverage is low, reduce trigger thresholds or add automated interventions for yellow-tier accounts; if save rate is low, review playbook execution quality. [src4]
| Error | Likely Cause | Recovery Action |
|---|---|---|
| Health scores cluster at one end (>80% green or >50% red) | Threshold calibration wrong; signals not differentiating | Back-test against 6 months of actual churn; adjust thresholds to 60/25/15 distribution |
| No correlation between health score and actual churn | Wrong signals or data quality issues | Re-run Step 1; replace lowest-correlation signals; run logistic regression to find actual predictive variables |
| CS platform integration fails with CRM | API auth error or field mapping mismatch | Check credentials; verify required fields; use CSV export as temporary bridge |
| Dunning recovery rate below 30% | Email deliverability or poor timing | Test delivery with seed list; adjust timing to payroll cycles; add SMS + in-app; consider dedicated dunning tool |
| Playbooks not executed by CS team | Too complex, alert fatigue, or unclear ownership | Simplify to 3-step playbooks; reduce to top 5 triggers; add to daily standup; assign clear owners |
| Cohort analysis shows no patterns | Insufficient data or wrong segmentation | Increase cohort size by merging months; try alternative dimensions (channel, tier, geography) |
| Dashboard data stale or disconnected | API rate limits, credential expiry, or pipeline failure | Set up freshness alerts; add manual fallback; check API keys; verify ETL pipeline |
| Component | Free/Spreadsheet | Starter ($6K-$15K) | Mid-Market ($20K-$60K) | Enterprise ($60K+) |
|---|---|---|---|---|
| CS platform | Google Sheets ($0) | Totango/Vitally ($6K-$15K) | ChurnZero ($15K-$26K) | Gainsight ($60K-$140K) |
| Product analytics | Mixpanel Free ($0) | Mixpanel Free ($0) | Mixpanel Growth ($3K-$30K) | Enterprise ($25K-$100K) |
| Dunning recovery | Stripe Smart Retries ($0) | Churnkey ($1.2K-$6K) | Recurly ($3K-$12K) | Custom ($12K-$36K) |
| BI/Dashboard | Google Sheets ($0) | Metabase OSS ($0) | Looker/Tableau ($5K-$15K) | Looker Enterprise ($15K-$50K) |
| Implementation | 2-4 weeks (DIY) | 4-6 weeks | 6-8 weeks | 3-6 months |
| Total Year 1 | $0-$500 | $7K-$21K | $26K-$83K | $112K-$326K |
| Expected ARR saved | $20K-$100K | $100K-$500K | $500K-$3M | $3M-$20M |
| Typical ROI | 5-10x | 7-15x | 10-20x | 15-30x |
Blanket retention efforts (mass discounts, generic "we miss you" emails) waste budget on healthy customers and under-serve at-risk ones. Enterprise churn drivers differ fundamentally from SMB. [src1]
Match each segment x root-cause combination to a specific playbook with documented save economics. Nine playbooks (3 root causes x 3 tiers) is the minimum viable coverage.
Discounting without fixing the root problem delays churn by one cycle at lower margin. Discount-saved customers churn at 2-3x the rate of problem-solved customers in the following year. [src2]
Save hierarchy: solve the problem → offer scope reduction → discount. Discounts are for genuine budget-only objections.
Over-engineered models are harder to maintain, explain to CS teams, and debug. Teams spend months building a "perfect" model that never ships. [src4]
Deploy a simple model in week 3, measure accuracy over 90 days, then add signals. A logistic regression on 5 variables gets 80% of a 30-variable ML model's predictive power.
Involuntary churn is 20-40% of total and the cheapest to fix. Smart retry alone recovers 70% of failed payments. Pre-expiry alerts prevent 40%+ of failures. [src5]
Fix the leak that costs nothing to fix first. A 10-pp improvement in dunning recovery often equals $50K-$500K in recovered ARR with zero marginal cost.
Use when a company has 100+ customers, at least 6 months of data, and needs to build or rebuild a systematic retention program — not plan one, but actually deploy health scoring, configure smart dunning, activate intervention playbooks, and measure results on a live dashboard. Requires lifecycle data, product usage, and support history as inputs; produces an operational retention system as output. A 10-point NRR improvement boosts valuation by 20-30%. [src6]