This recipe produces a scored technical debt assessment, migration decision matrix, and phased architecture evolution roadmap — the three deliverables a startup CTO needs to decide when, what, and how to re-architect from MVP. The output answers the critical question: is the current architecture a growth bottleneck or still serviceable, and what is the cost/benefit of each migration path. [src1]
```
Which assessment depth?
├── Quick health check (1 day) — DORA metrics + code quality scan
│   └── PATH A: Automated Scan — SonarQube + GitHub analytics
├── Standard assessment (2-3 days) — Full 10-point scoring + bottleneck analysis
│   └── PATH B: Manual + Automated — Code review + monitoring + DORA
├── Deep architecture review (4-5 days) — Full assessment + migration planning
│   └── PATH C: Architecture Audit — PATH B + load testing + prototyping
└── Pre-investment due diligence (5-7 days) — Investor-grade assessment
    └── PATH D: Full Due Diligence — PATH C + security + compliance
```
| Path | Tools | Cost | Duration | Output Depth |
|---|---|---|---|---|
| A: Quick Scan | SonarQube, GitHub Analytics | $0 | 1 day | Scorecard only |
| B: Standard | SonarQube, Datadog/Grafana, manual review | $0-150 | 2-3 days | Scorecard + bottleneck map |
| C: Deep Review | All B + k6/Locust, diagramming | $0-300 | 4-5 days | Full assessment + migration plan |
| D: Due Diligence | All C + security scanner | $200-500 | 5-7 days | Investor-grade report |
Duration: 2-4 hours · Tool: GitHub/GitLab analytics, CI/CD logs
Collect the four DORA metrics from the last 90 days — deployment frequency, lead time for changes, change failure rate, and mean time to recovery. [src2]
```shell
# GitHub: extract deployment frequency (last 90 days).
# The `created` filter uses the API's date-range syntax; use `date -v-90d`
# instead of `date -d` on macOS. Emitting one id per run and counting with
# wc -l avoids --jq producing a separate count per paginated page.
gh api "repos/{owner}/{repo}/actions/runs?created=>$(date -d '90 days ago' +%F)" \
  --paginate \
  --jq '.workflow_runs[] | select(.conclusion=="success") | .id' | wc -l
```
Verify: All four metrics have values from at least 30 data points · If failed: Use git log analysis as proxy
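Once the raw counts are in hand, they can be mapped onto the DORA performance bands. A minimal sketch, assuming a 90-day window; the thresholds approximate the published DORA benchmarks and the function name is illustrative:

```python
def classify_deploy_frequency(deploys_per_90d: int) -> str:
    """Map a raw 90-day deployment count to an approximate DORA band.

    Thresholds roughly follow the published DORA benchmarks:
    elite = daily or better, high = weekly to daily,
    medium = monthly to weekly, low = less than monthly.
    """
    per_week = deploys_per_90d / (90 / 7)
    if per_week >= 7:
        return "elite"
    if per_week >= 1:
        return "high"
    if per_week >= 0.25:
        return "medium"
    return "low"
```

The same banding idea applies to the other three metrics (lead time, change failure rate, MTTR), each with its own thresholds.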
Duration: 1-3 hours · Tool: SonarQube / Code Climate
Score across 10 dimensions: code quality, test coverage, dependency freshness, build reliability, deploy velocity, incident rate, database health, API latency, security posture, documentation. Total score /100. [src1]
Verify: Score < 50 = urgent re-architecture. 50-70 = planned remediation. > 70 = maintain and iterate · If failed: Use ESLint/Pylint for basic quality metrics
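The aggregation itself is simple enough to sketch. Assuming each of the ten dimensions is scored 0-10 (equal weighting, so the total lands on the /100 scale), the bands follow the verify thresholds above:

```python
# The recipe's ten dimensions, equally weighted; each scored 0-10.
DIMENSIONS = [
    "code_quality", "test_coverage", "dependency_freshness",
    "build_reliability", "deploy_velocity", "incident_rate",
    "database_health", "api_latency", "security_posture", "documentation",
]

def aggregate_score(scores: dict) -> tuple:
    """Sum per-dimension scores to /100 and map to an action band."""
    missing = set(DIMENSIONS) - set(scores)
    if missing:
        raise ValueError(f"unscored dimensions: {sorted(missing)}")
    total = sum(scores[d] for d in DIMENSIONS)
    if total < 50:
        band = "urgent re-architecture"
    elif total <= 70:
        band = "planned remediation"
    else:
        band = "maintain and iterate"
    return total, band
```

Teams that weight dimensions unequally (e.g. security heavier than documentation) can swap the plain sum for a weighted one without changing the bands.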
Duration: 2-4 hours · Tool: pg_stat_statements, EXPLAIN ANALYZE
Identify top 10 slow queries, assess index coverage, measure connection utilization, and map database scaling ceiling. The most common MVP scaling wall is the database. [src6]
Verify: Top 10 slow queries identified, index coverage assessed · If failed: Use EXPLAIN ANALYZE on known slow endpoints
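The query runs server-side (roughly `SELECT query, calls, total_exec_time FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 10;` on PostgreSQL 13+; older versions name the column `total_time`). The ranking step can be sketched in pure Python over exported rows; the field names below are assumptions mirroring those columns:

```python
def top_slow_queries(rows: list, n: int = 10) -> list:
    """Rank exported pg_stat_statements rows by mean time per call.

    rows: dicts with "query", "calls", and "total_exec_time" (ms).
    Total time flags aggregate load; mean time flags the worst
    single-call offenders, which is what index work usually targets.
    """
    return sorted(
        rows,
        key=lambda r: r["total_exec_time"] / max(r["calls"], 1),
        reverse=True,
    )[:n]
```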
Duration: 2-4 hours · Tool: Cloud monitoring (CloudWatch / Datadog / Grafana)
Map CPU, memory, disk I/O, network utilization, request latency (P50/P95/P99), and error rates over 30 days. Identify which resource hits ceiling first under growth projections.
Verify: All resource metrics collected, peak utilization periods identified · If failed: Use system tools (top, iostat) during peak hours
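Projecting when a resource hits its ceiling is a one-line compound-growth calculation. A sketch, assuming load grows at a steady monthly rate:

```python
import math

def months_to_ceiling(current_util: float, ceiling: float,
                      monthly_growth: float) -> float:
    """Months until a resource at current_util (0-1) reaches ceiling (0-1),
    assuming load compounds at monthly_growth (0.10 = 10%/month)."""
    if current_util >= ceiling:
        return 0.0          # already at or past the ceiling
    if monthly_growth <= 0:
        return math.inf     # flat or shrinking load never hits it
    return math.log(ceiling / current_util) / math.log(1 + monthly_growth)
```

For example, a database at 40% CPU with 10%/month traffic growth reaches an 80% alert threshold in about seven months — run the calculation per resource and the smallest result is the first bottleneck.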
Duration: 4-8 hours · Tool: Spreadsheet / document
Score four options — optimize in place, strangler fig migration, parallel rebuild, platform migration — across five weighted factors: time to impact, engineering cost, regression risk, long-term scalability, team disruption. [src3] [src4]
Verify: All options scored, recommendation selected with confidence level · If failed: Flag data gaps, recommend collection period before deciding
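The matrix arithmetic is worth making explicit. A sketch with placeholder weights; the options and five factors come from the recipe, but the weights and 1-5 scores are examples a team would replace with its own:

```python
# Placeholder weights summing to 1.0; tune to your context.
WEIGHTS = {
    "time_to_impact": 0.25, "engineering_cost": 0.20,
    "regression_risk": 0.20, "long_term_scalability": 0.25,
    "team_disruption": 0.10,
}

def rank_options(scores: dict) -> list:
    """scores: option -> factor -> 1-5, where higher is better
    (so a cheap option scores 5 on engineering_cost).
    Returns (option, weighted_total) pairs, best first."""
    totals = {
        opt: round(sum(WEIGHTS[f] * s for f, s in factors.items()), 2)
        for opt, factors in scores.items()
    }
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```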
Duration: 4-8 hours · Tool: Document + diagramming (Mermaid, draw.io)
Create phased roadmap: Phase 0 (Foundation — monitoring + tests), Phase 1 (Quick Wins — indexes + pooling + CDN), Phase 2 (Service Extraction — strangler fig), Phase 3 (Data Layer Evolution — replicas + caching), Phase 4 (Operational Excellence — SLOs + chaos engineering). Each phase has quality gates.
Verify: Roadmap has specific milestones, gates, resource requirements, and budget per phase · If failed: Produce high-level timeline and iterate with team input
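The quality-gate mechanic can be sketched as data plus a single check: a phase is entered only when every gate of the preceding phases is satisfied. Phase names follow the recipe; the individual gate names are illustrative:

```python
# Ordered phases with their exit gates (gate names are illustrative).
PHASES = [
    ("Phase 0: Foundation", ["monitoring_live", "test_suite_green"]),
    ("Phase 1: Quick Wins", ["slow_queries_indexed", "connection_pooling_on",
                             "cdn_enabled"]),
    ("Phase 2: Service Extraction", ["first_service_extracted"]),
]

def next_phase(completed_gates: set) -> str:
    """Return the first phase whose gates are not all satisfied yet."""
    for name, gates in PHASES:
        if not set(gates) <= completed_gates:
            return name
    return "roadmap complete"
```

Encoding the gates this way keeps the roadmap honest: a phase cannot be declared done by fiat, only by checking off its gates.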
```json
{
  "output_type": "technical_scaling_assessment",
  "format": "structured document bundle",
  "sections": [
    {"name": "dora_metrics", "type": "object", "description": "Four DORA metrics with scores and benchmarks"},
    {"name": "code_quality_score", "type": "number", "description": "Aggregate score 0-100 from 10-point assessment"},
    {"name": "database_assessment", "type": "object", "description": "Slow queries, index coverage, scaling ceiling"},
    {"name": "infrastructure_utilization", "type": "object", "description": "Resource heatmap with bottleneck identification"},
    {"name": "migration_recommendation", "type": "string", "description": "Selected option with confidence"},
    {"name": "roadmap_phases", "type": "array", "description": "Ordered phases with gates and timelines"}
  ],
  "expected_deliverables": "3-5 documents + 2 diagrams"
}
```
| Quality Metric | Minimum Acceptable | Good | Excellent |
|---|---|---|---|
| DORA metrics coverage | All 4 metrics measured | 90-day trend data | 12-month trend with seasonality |
| Code quality data points | SonarQube scan completed | 10-point assessment scored | Trend comparison with previous quarter |
| Database assessment depth | Top 10 slow queries identified | Index coverage + connection analysis | Load test results at 2x/5x/10x |
| Roadmap specificity | Phase names and rough timeline | Specific milestones and gates | Resource allocation + budget per phase |
If below minimum: Re-run data collection with broader time window, or engage external DevOps consultant.
| Error | Likely Cause | Recovery Action |
|---|---|---|
| pg_stat_statements not available | Extension not enabled | Add pg_stat_statements to shared_preload_libraries, restart, run CREATE EXTENSION pg_stat_statements; collect 24h of data |
| SonarQube scan fails | Memory limit exceeded | Increase Docker memory to 4GB |
| Cloud metrics API empty | Monitoring not configured | Verify monitoring agent, check region |
| Git history too short | New repo or squash-merged | Use PR merge data as proxy |
| Load test crashes production | Ran against production | Always target staging environment |
| Component | Free Tier | Paid Tier | At Scale |
|---|---|---|---|
| Code analysis (SonarQube) | $0 (community) | $150/mo (cloud) | $400/mo (enterprise) |
| Monitoring (Datadog/Grafana) | $0 (Grafana OSS) | $23/host/mo | $50/host/mo |
| Load testing (k6/Locust) | $0 (self-hosted) | $0 (CLI) | $600/mo (cloud) |
| Engineering time | 2 days | 3-5 days | 5-7 days |
| Total | $0 + 2 eng-days | $150-300 + 3-5 eng-days | $500-1000 + 5-7 eng-days |
Anti-pattern (big bang rewrite): stopping all feature development to rebuild from scratch. 60-80% of big bang rewrites fail or overrun their timeline because hidden complexity is underestimated. [src3]
Pattern (strangler fig): incrementally replace components while the old system serves traffic. Route new functionality through the new system and migrate piece by piece. [src4]
Anti-pattern (premature microservices): splitting a monolith before the team has operational maturity. A 3-5 person team managing 15 microservices spends more time on infrastructure than features. [src6]
Pattern (modular monolith): refactor into well-defined modules with clean boundaries, and extract to services only when modules have genuinely different scaling or ownership requirements.
Use this recipe when a startup CTO or technical founder needs evidence rather than intuition to decide whether to re-architect from MVP — specifically when feature velocity is declining, reliability is suffering, or the team spends more time on workarounds than new capabilities.