Pilot Execution Playbook — Signal Stack

Type: Execution Recipe Confidence: 0.85 Sources: 5 Verified: 2026-03-29

Purpose

This recipe executes a Signal Stack pilot that delivers 10-20 qualified dossiers per week to 2-3 pilot customers. The pilot validates that signal-driven outreach produces measurably higher conversion than cold outreach — the core thesis that 95% of any market is not buying at any moment, but the 5% in active crisis are detectable through observable corporate distress signals. Phase 1 exit criteria: 3 paying customers in vertical #1. [src1, src3]

Prerequisites

Constraints

Tool Selection Decision

Which path?
├── Technical team available (can write Python)
│   ├── Budget > $500/month
│   │   └── PATH A: Full API stack — LLM + Clearbit/Apollo + Resend
│   └── Budget < $500/month
│       └── PATH B: LLM + free enrichment + manual delivery
├── No-code team
│   ├── Budget > $500/month
│   │   └── PATH C: Make/n8n + Clay + LLM API
│   └── Budget < $500/month
│       └── PATH D: Manual pipeline — Google Alerts + ChatGPT + email
└── Hybrid (one developer + ops person)
    └── PATH E: Python scrapers + LLM API + manual QA
PathToolsCost/monthSpeedOutput Quality
A: Full APIPython + Claude/GPT-4 + Clearbit + Resend$500-80010-20/weekExcellent
B: Budget APIPython + LLM + Apollo free + manual$200-40010-15/weekGood
C: No-codeMake/n8n + Clay + LLM API$400-7008-15/weekGood
D: ManualGoogle Alerts + ChatGPT + manual$20-505-10/weekAdequate
E: HybridPython + LLM API + manual QA$300-50010-20/weekExcellent

Execution Flow

Step 1: Baseline Measurement

Duration: 2-3 days · Tool: CRM export + spreadsheet analysis

Collect current outreach performance metrics from each pilot customer: total outreach volume, open rate, reply rate, meeting-booked rate, close rate, average deal size, and cost per meeting.

Verify: Baseline spreadsheet complete for all pilot customers with 3+ months of historical data. · If failed: Use industry benchmarks (15-25% open, 1-3% reply, 0.5-1% meeting rate).

Step 2: Pipeline Calibration

Duration: 3-5 days · Tool: Python scripts + LLM API + test data

Run the pipeline on a test batch of 50-100 signal events. Classify each as true positive, false positive, or ambiguous. Calculate initial precision rate and calibrate classification thresholds. [src1]

Verify: Precision rate > 60% on test batch. · If failed: Return to taxonomy workshop, add exclusion rules for top 3 false positive patterns.

Step 3: First Dossier Batch (Week 1)

Duration: 5 days (ongoing weekly) · Tool: Full pipeline + human review

Generate 10-20 dossiers containing signal evidence, company profile, decision-maker identification, tailored outreach copy, and proof pack. Human-in-the-loop review scores each dossier 1-5 on accuracy, completeness, relevance, and proof quality. [src2]

Verify: At least 10 dossiers pass quality review (score >= 3). · If failed: Pause delivery, tighten classification, generate new batch.

Step 4: A/B Test Package Formats

Duration: 2 weeks (parallel to delivery) · Tool: Email delivery with variant tracking

Split delivery into 2-3 format variants: full PDF dossier, executive summary email, data-only alert. Track open/reply/meeting rates per variant. [src3]

Verify: Statistical significance on at least one metric after 2 weeks. · If failed: Extend test or reduce to 2 variants.

Step 5: Weekly Taxonomy Iteration

Duration: 2-4 hours/week (ongoing) · Tool: Spreadsheet analysis + taxonomy update

Review all signals weekly. Classify outcomes as true positive, unknown, or false positive. Update taxonomy rules based on false positive analysis. Target: < 30% false positive rate by week 4. [src1, src2]

Verify: False positive rate decreasing week-over-week. · If failed: Replace weakest signal source or add corroborating signal requirement.

Step 6: Conversion Tracking

Duration: Ongoing (weekly) · Tool: CRM integration or manual tracking

Track full funnel: dossier sent → opened → replied → meeting → proposal → close. Compare against baseline. Key metric: > 2x conversion vs. cold outreach baseline. [src3, src5]

Verify: Conversion data for 80%+ of delivered dossiers by week 4. · If failed: Add manual follow-up calls to capture outcomes.

Step 7: Phase 1 Exit Assessment

Duration: 1-2 days · Tool: Analysis + presentation

Compile exit assessment: signal accuracy, conversion vs. baseline, customer satisfaction, unit economics, go/no-go for platform extraction. Phase 1 exit criteria: 3 paying customers in vertical #1. [src1, src2]

Verify: Exit assessment with data-backed recommendation complete. · If failed: Extend pilot by 4 weeks or reassess vertical selection.

Output Schema

{
  "output_type": "pilot_performance_report",
  "format": "spreadsheet + PDF summary",
  "sections": [
    {"name": "baseline_metrics", "type": "object", "description": "Pre-pilot outreach performance per customer"},
    {"name": "weekly_dossier_batches", "type": "array", "description": "Dossier count, quality scores, delivery per week"},
    {"name": "signal_accuracy", "type": "object", "description": "Precision rate, false positive rate, taxonomy iterations"},
    {"name": "conversion_funnel", "type": "object", "description": "Open/reply/meeting/close rates vs baseline"},
    {"name": "ab_test_results", "type": "object", "description": "Package format variant performance"},
    {"name": "exit_assessment", "type": "object", "description": "Go/no-go for platform extraction"}
  ]
}

Quality Benchmarks

Quality MetricMinimum AcceptableGoodExcellent
Dossier volume (per week)>= 10>= 15>= 20
Signal precision rate> 60%> 75%> 85%
Dossier quality score (avg)> 3.0/5> 3.5/5> 4.0/5
Conversion vs. baseline> 1.5x> 2x> 3x
False positive trendFlatDecreasing< 20% by week 4
Pilot customer retention2/3 continue3/3 continue3/3 convert to paid

If below minimum: Pause delivery, return to taxonomy calibration, extend pilot by 2 weeks.

Error Handling

ErrorLikely CauseRecovery Action
Precision < 60% after calibrationTaxonomy too broad or sources too noisyNarrow to top 2-3 sources, add compound signal requirement
Pilot customer stops respondingDossier quality too lowDirect outreach, request feedback, pivot contact
Volume < 10/weekLow event frequency in verticalAdd signal sources or broaden geographic scope
A/B test inconclusiveVolume too lowExtend test period or reduce variants
Compliance flagPECR/CAN-SPAM violationPause delivery, review compliance, switch to opt-in only
LLM classification degradesPrompt drift or model updateRe-calibrate prompts, pin model version

Cost Breakdown

ComponentBudget ($5K)Standard ($10K)Premium ($15K)
Signal Architect labor$2K$4K$6K
LLM API costs$500$1K$1.5K
Enrichment APIs$300$600$1K
Delivery infrastructure$200$400$500
Domain advisor$0$2K$3K
QA and iteration$1K$2K$3K
Total (4-week pilot)$4K-$5K$10K$15K

Anti-Patterns

Wrong: Optimizing for volume over quality in week 1

Pushing 20+ dossiers before signal accuracy is validated. Result: pilot customers receive irrelevant dossiers, lose trust, and the pilot fails. [src2]

Correct: Cap at 10 dossiers week 1, scale after quality is validated

Deliver 10 human-reviewed dossiers in week 1. Only increase volume after customer confirms relevance on >= 7/10.

Wrong: Skipping baseline measurement

Starting without documenting current outreach performance. Result: you cannot prove the 2x conversion claim. [src3]

Correct: Measure before you move

Spend 2-3 days collecting 3+ months of historical outreach data before delivering a single dossier.

Wrong: Not iterating on the taxonomy weekly

Treating the initial taxonomy as fixed. Result: false positive rate stays high, quality plateaus, customers churn. [src1]

Correct: Weekly taxonomy iteration is non-negotiable

Review every false positive weekly. Update classification rules and document changes. The taxonomy should visibly improve each week.

When This Matters

Use when an agent needs to execute or plan a Signal Stack pilot engagement. This is the hands-on delivery recipe — it takes a completed signal audit and taxonomy and turns them into measurable results. The pilot is the validation gate: if it does not produce 3 paying customers with > 2x conversion, do not proceed to platform extraction.

Related Units