Evaluates the soundness of a software system's technical architecture across six dimensions: scalability, SPOF resilience, CI/CD maturity, incident response, observability, and code quality. Produces a structured diagnostic identifying where architecture enables or constrains the business. [src1]
What this measures: System's ability to handle increasing load through horizontal or vertical scaling while maintaining acceptable latency.
| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | No capacity planning; system fails under moderate spikes | No load testing; single server; p95 unknown |
| 2 | Emerging | Basic vertical scaling; known bottlenecks unaddressed | Manual scaling; shared DB; p95 > 2s |
| 3 | Defined | Horizontal scaling for stateless services; load testing in release cycle | Auto-scaling configured; DB read replicas; p95 < 500ms |
| 4 | Managed | Architecture designed for 10x load; data-driven capacity planning | Proven 5x spike handling; multi-region; p95 < 200ms |
| 5 | Optimized | Elastic architecture handles 100x spikes; cost-optimized scaling | Real-time auto-scaling; sub-linear cost growth |
Red flags: No one knows the current p95; outages during traffic spikes; single-instance database. [src3]
Quick diagnostic question: "What happens if traffic triples tomorrow — when did you last test that?"
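The rubric above keys several levels to p95 latency. As a minimal sketch of what "knowing your p95" means in practice, the nearest-rank percentile can be computed directly from raw request samples (the sample values below are purely illustrative):

```python
import math

def p95(latencies_ms: list[float]) -> float:
    """Return the 95th-percentile latency using the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    ordered = sorted(latencies_ms)
    # Nearest-rank: ceil(0.95 * n) gives the 1-based rank of the p95 sample.
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]

# Hypothetical request latencies in milliseconds; one slow outlier.
samples = [120, 140, 95, 180, 2200, 150, 130, 160, 110, 175]
print(p95(samples))  # 2200 — the outlier dominates the tail
```

Note how a single slow request dominates the tail: averages hide exactly the behavior this dimension scores.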
What this measures: Whether single points of failure have been identified and mitigated across infrastructure, data, dependencies, and people.
| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | No SPOF analysis; critical systems on single instances; bus factor of 1 | Single DB server; no failover; no dependency mapping |
| 2 | Emerging | Some SPOFs identified but not mitigated; basic backups | Backups exist but untested; partial redundancy |
| 3 | Defined | Formal SPOF audit; critical path redundancy; DR plan tested annually | Failover for critical services; bus factor >= 2; backup restoration tested quarterly |
| 4 | Managed | Chaos engineering active; automated failover; quarterly DR tests | Chaos experiments monthly; failover < 60s; RTO < 4h |
| 5 | Optimized | Self-healing infrastructure; multi-region active-active; zero customer impact from failures | 80%+ automated remediation; active-active 2+ regions |
Red flags: No SPOF analysis done; single DB with no replication; single engineer owns critical subsystem. [src4]
Quick diagnostic question: "If your primary database dies right now, what happens and how long until recovery?"
What this measures: Speed, safety, and automation of the delivery pipeline — code commit to production, benchmarked against DORA metrics.
| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | Manual, risky, all-day deployment events; releases less than monthly | No automated testing; change failure > 45%; lead time > 6 months |
| 2 | Emerging | Basic CI; semi-automated deploys; monthly-ish releases | Automated build only; change failure 30-45%; lead time 1-6 months |
| 3 | Defined | Full CI/CD; automated testing; weekly-daily deploys; feature flags | 60%+ coverage; change failure 15-30%; lead time 1w-1m; rollback documented |
| 4 | Managed | Continuous deployment with canary/blue-green; multiple deploys/day | < 15% change failure; lead time 1d-1w; MTTR < 1h; 80%+ coverage |
| 5 | Optimized | On-demand zero-downtime deployment; progressive delivery; automated quality gates | < 5% change failure; lead time < 1 day; deploy confidence > 99% |
Red flags: Deployments only when specific people available; no automated tests; deployment freezes beyond holidays; rollback = restore from backup. [src1]
Quick diagnostic question: "How often do you deploy, and what percentage requires a hotfix or rollback?"
What this measures: Maturity of processes for detecting, responding to, and learning from production incidents.
| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | Customers find issues first; no process; no postmortems | No on-call; no alerting; MTTD in hours/days |
| 2 | Emerging | Basic on-call; some alerting; reactive handling | Informal on-call; postmortems for major incidents only |
| 3 | Defined | Structured process; severity levels; blameless postmortems standard | Runbooks for top 10 failures; MTTD < 15min for P1; action items tracked |
| 4 | Managed | Distributed incident response; automated detection; SLOs with error budgets | Automated incident creation; quarterly trend review; 90%+ action items completed |
| 5 | Optimized | Proactive prevention; automated remediation; predictive alerting | 60%+ alerts auto-remediated; incident rate declining QoQ |
Red flags: No on-call; customers report outages first; same incident type recurs monthly. [src5]
Quick diagnostic question: "When was your last production incident, how did you find out, and what did you change?"
What this measures: Ability to understand system behavior through logs, metrics, traces, and dashboards.
| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | No centralized logging; basic CPU/memory metrics only | `console.log` debugging; SSH into production; no dashboards |
| 2 | Emerging | Centralized logging; basic monitoring; some app metrics | Log aggregation; 1-2 dashboards; no distributed tracing |
| 3 | Defined | Structured logging; APM deployed; SLI-based alerting | Three pillars (logs, metrics, traces); custom dashboards per team |
| 4 | Managed | Full distributed tracing; anomaly detection; observability as code | End-to-end tracing; < 5% false positive rate; Terraform-managed |
| 5 | Optimized | AI-assisted root cause analysis; predictive observability | AIOps correlation; sub-minute root cause ID; cost-optimized telemetry |
Red flags: Engineers SSH into production to debug; no one can state the current p95; monitoring is infrastructure-only. [src3]
Quick diagnostic question: "If a user reports slowness, what tool does your engineer open first, and how quickly can they find the root cause?"
What this measures: Structural health of the codebase — technical debt, test coverage, documentation, and onboarding speed.
| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | No standards; no review; < 20% test coverage; high coupling | No linting; no PRs; no docs; every change risks regressions |
| 2 | Emerging | Basic standards; inconsistent review; 20-40% coverage | Linting configured; some unit tests; onboarding 2-3 months |
| 3 | Defined | Enforced standards; mandatory review; 40-70% coverage; ADRs | 2-reviewer minimum; integration tests; onboarding < 30 days |
| 4 | Managed | Automated quality gates; > 70% coverage; tech debt budgeted | SonarQube gates in CI; quarterly debt reduction; clear module boundaries |
| 5 | Optimized | > 85% coverage; architecture enforces modularity; continuous refactoring | Mutation testing; onboarding < 2 weeks; health metrics trending positive |
Red flags: No code review; "only one person understands this"; test suite hours long or skipped; engineers afraid to refactor. [src6]
Quick diagnostic question: "How confident is your team in making a significant change to a core module without causing a regression?"
Formula: Overall Score = (Scalability + SPOF Resilience + CI/CD + Incident Response + Observability + Code Quality) / 6
| Overall Score | Level | Interpretation | Next Step |
|---|---|---|---|
| 1.0 - 1.9 | Critical | Architecture is a business risk — outages and slow delivery constrain growth | Address lowest dimension; invest in CI/CD and observability first |
| 2.0 - 2.9 | Developing | Basic systems in place but significant gaps; breaks at 3-5x scale | Close biggest gap; priority: observability > CI/CD > incident response |
| 3.0 - 3.9 | Competent | Solid foundation with room for optimization | Invest in chaos engineering, SLOs, progressive delivery |
| 4.0 - 4.5 | Advanced | Engineering enables the business rather than bottlenecking it; mature DevOps | Fine-tune cost efficiency; advanced observability; platform capabilities |
| 4.6 - 5.0 | Best-in-class | Architecture is a competitive advantage | Maintain; evaluate emerging paradigms |
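The formula and interpretation bands above can be sketched directly in code. The dimension scores below are hypothetical inputs for illustration:

```python
# Interpretation bands from the table: (lower bound, upper bound, level).
BANDS = [
    (1.0, 1.9, "Critical"),
    (2.0, 2.9, "Developing"),
    (3.0, 3.9, "Competent"),
    (4.0, 4.5, "Advanced"),
    (4.6, 5.0, "Best-in-class"),
]

def overall_score(scores: dict[str, int]) -> float:
    """Simple average of the six dimension scores (each 1-5), per the formula."""
    assert len(scores) == 6, "expected exactly six dimensions"
    return round(sum(scores.values()) / 6, 1)

def interpret(score: float) -> str:
    """Map a rounded overall score onto its interpretation band."""
    for low, high, level in BANDS:
        if low <= score <= high:
            return level
    raise ValueError(f"score out of range: {score}")

# Hypothetical assessment of a Series B-stage team:
scores = {
    "scalability": 3, "spof_resilience": 2, "cicd": 3,
    "incident_response": 2, "observability": 3, "code_quality": 3,
}
avg = overall_score(scores)        # 16 / 6 -> 2.7
print(avg, interpret(avg))         # 2.7 Developing
```

A simple average weights all six dimensions equally, which matches the formula; a real assessment might weight the weakest dimension more heavily, since the interpretation table's "Next Step" guidance always targets the lowest score first.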
| Weak Dimension (Score < 3) | Fetch This Card |
|---|---|
| Scalability & Performance | Cloud Migration Playbook |
| SPOF Resilience | Business Continuity Planning |
| CI/CD & Deployment | Technology Stack Decision Framework |
| Incident Response | Cyber Risk Quantification |
| Observability | Technology Stack Decision Framework |
| Code Quality | Technology Stack Decision Framework |
| Segment | Expected Average | "Good" Threshold | "Alarm" Threshold |
|---|---|---|---|
| Seed/Series A (1-5 eng) | 1.8 | 2.5 | 1.2 |
| Series B (6-20 eng) | 2.8 | 3.3 | 2.0 |
| Growth (21-50 eng) | 3.5 | 4.0 | 2.8 |
| Scale/Public (50+ eng) | 4.2 | 4.5 | 3.5 |
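Applying the segment benchmarks above is a straightforward threshold lookup. This is a sketch only; the segment boundaries follow the table, and the function name and return strings are illustrative:

```python
# Segment benchmarks from the table:
# (max engineers in segment, expected average, "good" threshold, "alarm" threshold)
SEGMENTS = [
    (5, 1.8, 2.5, 1.2),                # Seed/Series A
    (20, 2.8, 3.3, 2.0),               # Series B
    (50, 3.5, 4.0, 2.8),               # Growth
    (float("inf"), 4.2, 4.5, 3.5),     # Scale/Public
]

def benchmark(engineers: int, score: float) -> str:
    """Flag an overall score against its team-size segment's thresholds."""
    for max_eng, expected, good, alarm in SEGMENTS:
        if engineers <= max_eng:
            if score >= good:
                return "above 'good' threshold"
            if score <= alarm:
                return "below 'alarm' threshold"
            return f"within expected range (segment average {expected})"
    raise AssertionError("unreachable: last segment is unbounded")

# A 12-engineer (Series B) team scoring 2.4 overall:
print(benchmark(12, 2.4))  # within expected range (segment average 2.8)
```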
| Metric | Elite | High | Medium | Low |
|---|---|---|---|---|
| Deployment Frequency | On-demand (multiple/day) | Daily to weekly | Weekly to monthly | < monthly |
| Lead Time for Changes | < 1 day | 1 day - 1 week | 1 week - 1 month | 1 - 6 months |
| Change Failure Rate | < 5% | 5-15% | 15-30% | 30-45% |
| Mean Time to Recovery | < 1 hour | < 1 day | 1 day - 1 week | > 1 week |
Only 16.2% of teams achieve on-demand deployment; 23.9% deploy less frequently than monthly. [src2]
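As a rough sketch, the DORA table above can be turned into a classifier. One simplifying assumption here (not part of the benchmark itself): the team's tier is taken as the worst tier across the four metrics, whereas DORA derives tiers by clustering respondents. All input values are hypothetical:

```python
def dora_tier(deploys_per_week: float, lead_time_days: float,
              change_failure_pct: float, mttr_hours: float) -> str:
    """Classify per the DORA table, taking the worst tier of the four metrics
    (a simplification; DORA itself clusters survey respondents)."""
    def freq_tier(d: float) -> int:       # deployment frequency
        if d >= 7: return 0               # multiple per day -> Elite
        if d >= 1: return 1               # daily to weekly -> High
        if d >= 0.25: return 2            # weekly to monthly -> Medium
        return 3                          # less than monthly -> Low
    def lead_tier(days: float) -> int:    # lead time for changes
        if days < 1: return 0
        if days <= 7: return 1
        if days <= 30: return 2
        return 3
    def cfr_tier(pct: float) -> int:      # change failure rate
        if pct < 5: return 0
        if pct <= 15: return 1
        if pct <= 30: return 2
        return 3
    def mttr_tier(hours: float) -> int:   # mean time to recovery
        if hours < 1: return 0
        if hours < 24: return 1
        if hours <= 168: return 2         # up to one week
        return 3
    worst = max(freq_tier(deploys_per_week), lead_tier(lead_time_days),
                cfr_tier(change_failure_pct), mttr_tier(mttr_hours))
    return ["Elite", "High", "Medium", "Low"][worst]

print(dora_tier(10, 0.5, 4, 0.5))   # Elite on all four metrics
print(dora_tier(2, 3, 12, 6))       # High
```

Taking the worst metric mirrors how the rubric is used in practice: one lagging metric (for example a 40% change failure rate) negates elite deployment frequency.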
Fetch when a user asks to evaluate engineering architecture, prepare for technical due diligence (fundraising, acquisition), diagnose declining delivery velocity, assess readiness for a scaling phase, or onboard a new CTO/VP Engineering needing a baseline.