This recipe refactors a working Signal Stack vertical #1 into a reusable generic engine plus a declarative configuration layer. The extraction separates the 5 generic components (ingestion framework, classification pipeline, enrichment engine, document generator, delivery/tracking) from vertical-specific config (sources, triggers, targets, templates, conversion definitions). Target: each subsequent vertical requires < 50% of the effort of vertical #1. [src1, src2]
Which path?
├── Pipeline is a monolithic Python script
│ └── PATH A: Strangler Fig — wrap existing code, extract incrementally
├── Pipeline is partially modular
│ └── PATH B: Interface Extraction — define contracts, refactor in place
├── Pipeline is modular but tightly coupled
│ └── PATH C: Config Extraction — extract hard-coded values into config
└── Starting fresh (rare — unmaintainable code only)
└── PATH D: Rewrite with config-first architecture
| Path | Approach | Duration | Risk | Best For |
|---|---|---|---|---|
| A: Strangler Fig | Incremental wrapping | 4-5 weeks | Low | Monolithic pipelines |
| B: Interface Extraction | Define contracts, refactor | 3-4 weeks | Medium | Partially modular |
| C: Config Extraction | Pull values into config | 2-3 weeks | Low | Already modular |
| D: Rewrite | Build new, migrate | 5-8 weeks | High | Unmaintainable code only |
Duration: 2-3 days · Tool: Code review + architecture diagramming
Map current pipeline to the 5-layer architecture. For each layer, identify generic vs. vertical-specific components. Produce architecture diagram with annotated vertical assumptions. [src1]
Verify: Architecture diagram reviewed, every hard-coded vertical assumption annotated. · If failed: Pair with pipeline engineer to trace data flow end-to-end.
Duration: 3-5 days · Tool: JSON Schema or Pydantic model
Design the declarative config schema covering: sources, triggers, targets, templates, delivery, and conversion definitions. Validate by expressing vertical #1 entirely as config. [src3, src4]
Verify: Vertical #1 fully expressible as config + generic engine. · If failed: Extend schema or accept behavior as generic engine feature.
Duration: 5-8 days · Tool: Git branching + incremental refactoring
Extract layer by layer: ingestion framework, classification pipeline, enrichment engine, document generator, delivery/tracking. Use Strangler Fig pattern — wrap existing code, replace internals gradually. [src1, src4]
Verify: Engine runs vertical #1 from config file, output matches pre-extraction exactly. · If failed: Diff outputs, fix discrepancies before proceeding.
Duration: 2-3 days · Tool: Automated test suite
Run 20+ known-good examples through extracted engine + config. Compare classification, enrichment, dossier structure, and delivery behavior.
Verify: 100% pass on classification/enrichment, > 95% structural match on dossiers. · If failed: Add missing config parameters for failing edge cases.
Duration: 3-5 days · Tool: New config file + test data
Create vertical #2 config, run against 20-50 test signals. Measure: config creation time (< 3 days), engine code changes (zero), config coverage (> 90%). This is the extraction validation gate. [src1, src2]
Verify: Vertical #2 operational with zero engine changes. · If failed: Refactor affected component, re-run dry run.
Duration: 2-3 days · Tool: Documentation + walkthrough
Produce platform architecture document: component diagram, config schema reference, new-vertical guide, deployment topology, extension points. [src3]
Verify: Team member creates basic vertical config from docs alone. · If failed: Annotate documentation gaps and fill them.
{
"output_type": "platform_extraction_package",
"format": "code repository + documentation",
"sections": [
{"name": "generic_engine", "type": "object", "description": "5-layer reusable pipeline codebase"},
{"name": "config_schema", "type": "object", "description": "JSON/YAML schema for vertical config"},
{"name": "vertical_1_config", "type": "object", "description": "Existing vertical as config"},
{"name": "vertical_2_config", "type": "object", "description": "New vertical dry-run config"},
{"name": "regression_results", "type": "object", "description": "Test suite pass/fail report"},
{"name": "architecture_doc", "type": "object", "description": "Platform architecture and config guide"}
]
}
| Quality Metric | Minimum Acceptable | Good | Excellent |
|---|---|---|---|
| Regression pass rate | > 95% | > 98% | 100% |
| Vertical #2 config time | < 5 days | < 3 days | < 2 days |
| Engine code changes for V2 | < 5 changes | 1-2 changes | Zero |
| Config coverage | > 85% | > 90% | > 95% |
| V1 client disruption | < 1 hour downtime | Zero downtime | Zero disruption |
| Doc completeness | With help | Independent + questions | Fully independent |
If below minimum: Abstraction boundary is wrong. Pause vertical #2 launch, refactor engine, re-test.
| Error | Likely Cause | Recovery Action |
|---|---|---|
| Regression failures | Config schema missing edge cases | Diff outputs, add missing config parameters |
| V2 requires engine changes | Abstraction boundary too narrow | Generalize component, add config parameter |
| Config schema too complex (> 200 fields) | Over-engineering | Simplify: 20% config covers 80% variation |
| V1 clients report issues | Backward compatibility regression | Immediate rollback, fix, re-deploy |
| Team cannot create config from docs | Documentation gaps | Pair-program, annotate stumbling points, fill gaps |
| Component | Solo ($10K) | Small Team ($15K) | Dedicated ($25K) |
|---|---|---|---|
| Architecture audit | $1K | $2K | $3K |
| Config schema design | $2K | $3K | $4K |
| Engine extraction | $4K | $5K | $8K |
| Regression testing | $1K | $2K | $3K |
| Vertical #2 dry run | $1K | $2K | $4K |
| Documentation | $1K | $1K | $3K |
| Total | $10K | $15K | $25K |
Building generic infrastructure before proving the vertical commercially. Result: you optimize for flexibility nobody needs while the product stagnates. [src1]
Wait for 3 paying customers. Their usage patterns reveal what actually needs to be generic vs. one-off.
Creating a 200+ field config schema. Result: config becomes as complex as code, nobody can create a vertical without the architect. [src3]
Identify the 20% of variation that covers 80% of use cases. Accept that rare edge cases may require small code additions.
Rewriting from scratch. Result: months with no features, diverging codebases, scope creep. [src4]
Wrap existing code behind new interfaces. Extract one layer at a time. Zero downtime, zero risk.
Use when an agent needs to plan or execute the transition from a working single-vertical pipeline to a reusable platform. This is the critical inflection point: done right, it enables rapid vertical expansion at marginal cost. Done wrong, it kills momentum and wastes engineering time.