Engineering Productivity Benchmarks 2026
Summary
Comprehensive engineering productivity benchmarks covering DORA metrics, cycle time, PR metrics, developer experience, and quality indicators by team size and AI adoption. The 2025 DORA report replaced the elite/high/medium/low performer tiers with profile clusters and added a fifth metric. AI improved throughput by 30-40% but increased delivery instability. [src1]
Data vintage: H2 2025 data from 30,000+ professionals (DORA) and 6.1M+ pull requests (LinearB).
Key shift: AI is a double-edged sword — faster output but higher change failure rates. The old "elite performer" framework is officially dead.
Constraints
- Software engineering teams at tech companies. Not for embedded systems or hardware.
- Small teams (1-10) show 15-20% better per-engineer metrics. Normalize by team size.
- AI tools create a bimodal distribution: 30-40% faster cycle time but 15-25% higher CFR.
- Legacy DORA tiers (elite/high/medium/low) are outdated. 2025 uses profile clusters.
- AI-driven benchmarks shifting faster than any previous tooling transition.
Metrics
DORA Metrics
Deployment Frequency
Definition: How often code is deployed to production.
| Profile | Frequency | % of Orgs |
|---|---|---|
| On-demand | Multiple times/day | 16.2% |
| Daily to weekly | 1/day to 1/week | 21.9% |
| Weekly to monthly | 1/week to 1/month | 28.4% |
| Monthly to quarterly | 1/month to 1/quarter | 22.8% |
| Infrequent | < 1/quarter | 10.7% |
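The profile clusters above can be sketched as a simple classifier over a team's observed deployment cadence. This is an illustrative mapping of the frequency ranges in the table, not an official DORA formula; the function name and deploys-per-day boundaries are assumptions.

```python
# Sketch: map a deployment cadence to the 2025-style profile clusters above.
# Boundary values are an illustrative reading of the table's frequency ranges.

def deployment_profile(deploys: int, days: int) -> str:
    """Classify deployment frequency over an observation window."""
    per_day = deploys / days
    if per_day > 1:
        return "On-demand"              # multiple times/day
    if per_day >= 1 / 7:
        return "Daily to weekly"
    if per_day >= 1 / 30:
        return "Weekly to monthly"
    if per_day >= 1 / 90:
        return "Monthly to quarterly"
    return "Infrequent"                 # < 1/quarter

print(deployment_profile(45, 30))   # 1.5 deploys/day -> "On-demand"
print(deployment_profile(1, 180))   # -> "Infrequent"
```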
Lead Time for Changes by Team Size
| Team Size | Median | Top Quartile | Top Decile |
|---|---|---|---|
| Small (1-10) | 2.9 days | 1.2 days | < 6 hours |
| Medium (11-50) | 3.8 days | 2.1 days | < 12 hours |
| Large (51-200) | 5.2 days | 3.5 days | < 1 day |
| Enterprise (200+) | 8.5 days | 5.0 days | < 2 days |
Change Failure Rate
| Range | CFR | % of Teams |
|---|---|---|
| Excellent | < 2% | 8.5% |
| Good | 2-8% | 24.3% |
| Moderate | 8-16% | 26.0% |
| High | 16-30% | 25.7% |
| Critical | > 30% | 15.5% |
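CFR is the simplest DORA metric to compute: failed deployments divided by total deployments. A minimal sketch, assuming a deploy log with a boolean `failed` field (the record shape is illustrative); the band edges mirror the table above.

```python
# Sketch: compute change failure rate from a deploy log and bucket it into
# the CFR bands above. Deploy record shape is an assumption for illustration.

def change_failure_rate(deploys: list[dict]) -> float:
    """CFR = failed deployments / total deployments."""
    failed = sum(1 for d in deploys if d["failed"])
    return failed / len(deploys)

def cfr_band(cfr: float) -> str:
    if cfr < 0.02:
        return "Excellent"
    if cfr < 0.08:
        return "Good"
    if cfr < 0.16:
        return "Moderate"
    if cfr <= 0.30:
        return "High"
    return "Critical"

deploys = [{"failed": False}] * 18 + [{"failed": True}] * 2
print(cfr_band(change_failure_rate(deploys)))  # 2/20 = 10% -> "Moderate"
```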
Failed Deployment Recovery Time
| Profile | Recovery Time | % of Teams |
|---|---|---|
| Fast | < 1 hour | 12.8% |
| Good | 1-4 hours | 18.5% |
| Moderate | 4-24 hours | 25.4% |
| Slow | 1-7 days | 28.5% |
| Very slow | > 7 days | 14.8% |
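Recovery time is the elapsed duration from a failed deployment to service restoration. A minimal sketch bucketing that duration into the profiles above; the timestamp field names are assumptions and the band edges mirror the table.

```python
# Sketch: bucket a failed-deployment recovery duration into the profiles
# above. Timestamp arguments are illustrative inputs.
from datetime import datetime

def recovery_profile(failed_at: datetime, restored_at: datetime) -> str:
    hours = (restored_at - failed_at).total_seconds() / 3600
    if hours < 1:
        return "Fast"
    if hours < 4:
        return "Good"
    if hours < 24:
        return "Moderate"
    if hours <= 7 * 24:
        return "Slow"
    return "Very slow"

print(recovery_profile(datetime(2026, 2, 1, 9), datetime(2026, 2, 1, 15)))  # 6 hrs -> "Moderate"
```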
Cycle Time & PR Metrics
Cycle Time by Team Size
| Team Size | Median | Top Quartile | Top Decile |
|---|---|---|---|
| Small (1-10) | 26 hours | 15 hours | < 8 hours |
| Medium (11-50) | 48 hours | 28 hours | < 14 hours |
| Large (51-200) | 72 hours | 48 hours | < 24 hours |
| Enterprise (200+) | 120 hours | 72 hours | < 36 hours |
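Cycle time here means first commit to merge (consistent with the rules of thumb later in this document). A minimal sketch computing a team's median from PR timestamps; the PR record fields are illustrative assumptions.

```python
# Sketch: cycle time as first-commit-to-merge hours per PR, and the team
# median across PRs. PR record field names are assumptions for illustration.
from datetime import datetime
from statistics import median

def cycle_time_hours(pr: dict) -> float:
    return (pr["merged_at"] - pr["first_commit_at"]).total_seconds() / 3600

prs = [
    {"first_commit_at": datetime(2026, 1, 5, 9),  "merged_at": datetime(2026, 1, 6, 9)},   # 24 hrs
    {"first_commit_at": datetime(2026, 1, 7, 10), "merged_at": datetime(2026, 1, 9, 10)},  # 48 hrs
    {"first_commit_at": datetime(2026, 1, 8, 8),  "merged_at": datetime(2026, 1, 8, 20)},  # 12 hrs
]
print(median(cycle_time_hours(p) for p in prs))  # -> 24.0
```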
Developer Experience
| Metric | Median | Top Quartile | Top Decile |
|---|---|---|---|
| Focus hours/day | 4.2 | 5.8 | 6.5+ |
| PRs merged/month | 12.4 | 18.5 | 22+ |
| PR review time | 12-24 hrs | 4-12 hrs | < 4 hrs |
| Median PR size | 110 lines | 85 lines | < 60 lines |
Composite Metrics & Rules of Thumb
| Rule | Formula / Threshold | Interpretation |
|---|---|---|
| Deploy > 1/day | Deployments per day | High-performing pipeline |
| Lead time < 24 hrs | Commit to production | Fast, reliable delivery |
| CFR < 10% | Failed / total deploys | Stable delivery quality |
| FDRT < 4 hrs | Time to restore | Strong incident response |
| PR review < 12 hrs | Time to first review | No review bottleneck |
| PR size < 100 lines | Lines changed per PR | Reviewable, low-risk changes |
| Focus > 4 hrs/day | Uninterrupted work hours | Sufficient deep work |
| Cycle time < 3 days | First commit to merge | Healthy development flow |
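The rules of thumb above lend themselves to an automated health check. A minimal sketch, assuming a flat metrics dict whose key names are hypothetical; the thresholds are taken directly from the table.

```python
# Sketch: evaluate a team's metrics against the rule-of-thumb thresholds in
# the table above. Metric key names and input shape are assumptions.

THRESHOLDS = {
    "deploys_per_day": (lambda v: v > 1,    "Deploy > 1/day"),
    "lead_time_hours": (lambda v: v < 24,   "Lead time < 24 hrs"),
    "cfr":             (lambda v: v < 0.10, "CFR < 10%"),
    "fdrt_hours":      (lambda v: v < 4,    "FDRT < 4 hrs"),
    "review_hours":    (lambda v: v < 12,   "PR review < 12 hrs"),
    "pr_size_lines":   (lambda v: v < 100,  "PR size < 100 lines"),
    "focus_hours":     (lambda v: v > 4,    "Focus > 4 hrs/day"),
    "cycle_time_days": (lambda v: v < 3,    "Cycle time < 3 days"),
}

def health_check(metrics: dict) -> list[str]:
    """Return the rules a team currently fails (only for metrics provided)."""
    return [label for key, (ok, label) in THRESHOLDS.items()
            if key in metrics and not ok(metrics[key])]

team = {"deploys_per_day": 0.5, "cfr": 0.08, "pr_size_lines": 140}
print(health_check(team))  # -> ['Deploy > 1/day', 'PR size < 100 lines']
```

Only supplied metrics are checked, so partial telemetry still produces a useful report.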
Segment Definitions
| Segment | Definition | Typical Characteristics |
|---|---|---|
| Small (1-10) | Early-stage startup engineering | Low coordination, high per-person output |
| Medium (11-50) | Growth-stage or small product team | Emerging processes, team leads |
| Large (51-200) | Scale-up or enterprise product org | Formal processes, platform teams |
| Enterprise (200+) | Large company engineering org | Complex governance, multiple squads |
Year-over-Year Trend Summary
| Metric | 2024 | 2025 | 2026 (proj.) | Direction |
|---|---|---|---|---|
| % deploying daily+ | 34% | 38% | 40-43% | ↑ 4-5pp |
| Median lead time | 4.5 days | 3.8 days | 3.2-3.6 days | ↓ improving |
| Median CFR | 14% | 16% | 15-18% | ↑ worsening (AI-driven) |
| Median FDRT | 6 hrs | 5 hrs | 4-5 hrs | ↓ improving |
| Median cycle time | 56 hrs | 48 hrs | 40-46 hrs | ↓ 15-20% |
| Median focus hrs/day | 4.0 | 4.2 | 4.3-4.5 | ↑ 0.2-0.3 |
| Median PR size | 125 lines | 110 lines | 95-105 lines | ↓ improving |
Common Misinterpretations
- DORA as complete picture: DORA measures delivery capability, not value. A team deploying 10x/day may still build wrong features.
- Comparing across team sizes: a 5-person team deploying 3x/day is not "more productive" than a 200-person org deploying 1x/day. Normalize by size.
- Conflating AI speed with productivity: AI reduces cycle time 30-40% but increases CFR 15-25%. Net impact depends on failure resolution speed.
- Individual metrics for performance reviews: These are team/system measures. Using PR count to evaluate individuals incentivizes gaming and damages culture.
When This Matters
Fetch when a user asks about engineering team performance benchmarks, needs to set DORA metric targets, is evaluating developer productivity tools, or needs to benchmark cycle time and deployment practices by team size.