Evaluates data strategy maturity across six dimensions: architecture, data quality and governance, analytics capability, ML/AI readiness, data democratization, and data security. Identifies the weakest links constraining overall data capability and routes to specific improvement paths. The output enables data leaders to prioritize investments and build a sequenced roadmap. [src1]
Data Architecture. What this measures: The structural foundation — how data is stored, moved, integrated, and made available across the organization.
| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | No coherent architecture; data in app DBs and spreadsheets; no warehouse | No data catalog; manual ETL scripts; no documentation |
| 2 | Emerging | Central warehouse exists but incomplete; batch ETL with frequent failures | 30-60% source coverage; ETL failure rate >10%; stale documentation |
| 3 | Defined | Modern data stack — warehouse, orchestration, documented pipelines | 70-85% coverage; <5% ETL failures; dbt transformations; lineage exists |
| 4 | Managed | Domain-oriented architecture; real-time + batch; data contracts; IaC | Streaming pipelines; SLA-backed freshness; <1% failures; cost tracking |
| 5 | Optimized | Self-healing, multi-region architecture; data products with SLAs | Auto-scaling; data product marketplace; sub-second freshness |
Red flags: No one can draw the architecture; warehouse "on roadmap" for a year; ETL breaks weekly with no alerting. [src7]
Quick diagnostic question: "Can you show me a diagram of how data flows from production to analytics, and when was it last updated?"
Data Quality & Governance. What this measures: How reliably data reflects reality and whether formal governance structures ensure quality over time.
| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | No quality monitoring; issues found when reports look wrong | No metrics; duplicates widespread; no data dictionary |
| 2 | Emerging | Some quality checks in BI layer; reactive resolution | Spot checks; informal ownership; dictionary <30% coverage |
| 3 | Defined | Quality framework established; formal stewards; governance council | Quality dashboards; automated validation; monthly council meetings |
| 4 | Managed | Continuous monitoring with alerting; data contracts; MDM in place | >95% quality on critical data; anomaly detection; contracts enforced |
| 5 | Optimized | AI-powered monitoring; self-healing; zero-trust verification | ML anomaly detection; auto-remediation; quality is a KPI |
Red flags: Different teams report different numbers for the same metric. Per Gartner, poor data quality costs organizations 10-20% of revenue. [src4]
Quick diagnostic question: "If two departments pull the same revenue number right now, would they match?"
Analytics Capability. What this measures: Ability to extract, analyze, and act on insights — from basic reporting to advanced analytics and self-service.
| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | No analytics beyond spreadsheets; manual, inconsistent reports | Excel as primary tool; reports take days; gut-feel decisions |
| 2 | Emerging | BI tool deployed but <20% adoption; static dashboards | 3-5 dashboards; no self-service; 2-5 day ad hoc turnaround |
| 3 | Defined | Self-service analytics; semantic layer; standard KPIs agreed | 30-50% self-service; KPI framework documented; 1-day ad hoc |
| 4 | Managed | Advanced analytics — A/B testing, cohort analysis; analytics engineering | 60-80% self-service; experimentation platform; data literacy training |
| 5 | Optimized | Real-time operational analytics; embedded in products; NLP queries | Predictive models in production; analytics drives pricing/personalization |
Red flags: Leadership does not look at dashboards weekly; decisions come from the HiPPO (highest-paid person's opinion); data team backlog >3 months. [src5]
Quick diagnostic question: "When your CEO asks a business question, how long does it take to get an answer — and does it come from a dashboard, a person, or a spreadsheet?"
ML/AI Readiness. What this measures: Whether the organization has the data foundations, infrastructure, talent, and processes to run ML/AI in production.
| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | No ML capability; data not suitable for training; AI is a buzzword | No labeled datasets; no feature store; no models in production |
| 2 | Emerging | Exploratory ML in notebooks; POCs built but none in production | 1-3 data scientists; manual labeling; no MLOps |
| 3 | Defined | ML pipeline established; at least one model in production | Feature store; 1-5 models live; basic monitoring; GPU budget allocated |
| 4 | Managed | MLOps platform; automated training, versioning, A/B testing | MLflow/SageMaker/Vertex AI; model registry; 5-20 models; bias detection |
| 5 | Optimized | AI-first organization; GenAI/LLM deployed; RAG pipelines; AI governance | LLM fine-tuning; vector databases; 20+ models; AI ethics board |
Red flags: Data scientists spend >60% of their time on data prep; the company invests in LLMs without basic analytics maturity. [src6]
Quick diagnostic question: "How many ML models are in production, and how do you know they are still performing well?"
Data Democratization. What this measures: How broadly data access and literacy extend across the organization.
| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | Data access restricted to engineering; all requests go through bottleneck teams | No self-service; requests take >1 week; shadow spreadsheets |
| 2 | Emerging | Some BI access; data catalog started but incomplete | 10-20% use data tools; catalog <30% coverage; no literacy program |
| 3 | Defined | Self-service available; catalog maintained; RBAC; training offered | 40-60% regular users; critical datasets cataloged; data champions |
| 4 | Managed | Data-literate culture; internal marketplace; automated access provisioning | 60-80% active users; cross-team data projects; data in onboarding |
| 5 | Optimized | Data embedded in every role; NLP querying; data culture as hiring filter | >80% weekly engagement; data skills in every job description |
Red flags: Business teams maintain parallel spreadsheets; data team has 50+ ticket backlog; executives ask for reports that exist in dashboards. [src2]
Quick diagnostic question: "If a product manager needs yesterday's conversion rate right now, can they get it themselves?"
Data Security & Privacy. What this measures: How well the organization protects sensitive data, complies with regulations, and manages data risk.
| Score | Level | Description | Evidence |
|---|---|---|---|
| 1 | Ad hoc | No classification; PII in plain text; no encryption strategy | PII in logs/spreadsheets; prod data in dev; access not audited |
| 2 | Emerging | Basic classification; encryption at rest; some access controls | Critical tables classified; basic encryption; quarterly access reviews |
| 3 | Defined | Formal classification; column-level encryption; privacy impact assessments | 80%+ assets classified; PII masked in non-prod; monthly access reviews |
| 4 | Managed | Automated discovery/classification; dynamic masking; compliance reporting | Auto PII detection; SIEM integration; data residency controls |
| 5 | Optimized | Zero-trust data security; confidential computing; AI threat detection | Zero-trust architecture; real-time compliance; data ethics framework |
Red flags: Production data on laptops; PII in non-production environments; no retention policy enforced; last access audit >1 year ago. [src3]
Quick diagnostic question: "If I asked where all PII lives in your systems, how long would it take to produce a complete inventory?"
Formula: Overall Score = (Architecture + Quality & Governance + Analytics + ML/AI Readiness + Democratization + Security) / 6
For regulated industries, weight Security at 1.5x; for AI-heavy companies, weight ML/AI Readiness at 1.5x. When applying weights, divide by the sum of the weights instead of 6 so the result stays on the 1-5 scale.
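A minimal sketch of this scoring rule, with hypothetical dimension keys:

```python
DIMENSIONS = [
    "architecture", "quality_governance", "analytics",
    "ml_ai", "democratization", "security",
]

def overall_score(scores, weights=None):
    """Weighted average of the six 1-5 dimension scores."""
    weights = weights or {}
    total = sum(scores[d] * weights.get(d, 1.0) for d in DIMENSIONS)
    weight_sum = sum(weights.get(d, 1.0) for d in DIMENSIONS)
    return total / weight_sum  # dividing by the weight sum keeps the 1-5 scale

scores = {"architecture": 3, "quality_governance": 2, "analytics": 3,
          "ml_ai": 1, "democratization": 2, "security": 4}

print(f"{overall_score(scores):.2f}")                     # 2.50 unweighted
print(f"{overall_score(scores, {'security': 1.5}):.2f}")  # 2.62 for regulated industries
```

The table below translates the overall score into a maturity level and a next step.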
| Overall Score | Level | Interpretation | Next Step |
|---|---|---|---|
| 1.0 - 1.9 | Critical | No coherent data strategy; data is a liability; ML investment would be wasted | Start with architecture and governance foundations |
| 2.0 - 2.9 | Developing | Basic infrastructure but underutilized; significant quality gaps | Close quality gaps; establish governance council; build self-service |
| 3.0 - 3.9 | Competent | Solid foundation; ready for advanced analytics and initial ML | Invest in ML capabilities; mature data contracts; begin MLOps |
| 4.0 - 4.5 | Advanced | Data is a strategic asset; ML in production; data-informed culture | Optimize costs; evaluate GenAI/LLM; build data products |
| 4.6 - 5.0 | Best-in-class | Data-driven organization; AI-first approach; data as competitive moat | Maintain leadership; explore emerging paradigms |
For each dimension scoring below 3, fetch the matching improvement card:
| Weak Dimension (Score < 3) | Fetch This Card |
|---|---|
| Data Architecture | Data Platform Selection |
| Data Quality & Governance | Data Governance Framework |
| Analytics Capability | Analytics Stack Selection |
| ML/AI Readiness | ML Ops Maturity Assessment |
| Data Democratization | Data Literacy Program |
| Data Security & Privacy | Data Privacy Compliance Framework |
Benchmarks vary by company size:
| Segment | Expected Average | "Good" Threshold | "Alarm" Threshold |
|---|---|---|---|
| Startup (<50 employees) | 1.8 | 2.5 | 1.0 |
| Growth (50-500) | 2.7 | 3.3 | 2.0 |
| Enterprise (500-5000) | 3.4 | 4.0 | 2.5 |
| Large Enterprise (5000+) | 3.8 | 4.3 | 3.0 |
Industry modifiers: Financial services and healthcare add +0.5 to all thresholds. SaaS/technology companies typically score 0.3-0.5 higher than the segment average. [src5]
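Putting the score and the benchmarks together, a sketch of the comparison; the segment keys and `regulated` flag are hypothetical names, and the thresholds come from the table above:

```python
BENCHMARKS = {  # segment: (expected average, "good", "alarm")
    "startup":          (1.8, 2.5, 1.0),
    "growth":           (2.7, 3.3, 2.0),
    "enterprise":       (3.4, 4.0, 2.5),
    "large_enterprise": (3.8, 4.3, 3.0),
}

def assess(score, segment, regulated=False):
    """Label a score against segment thresholds; financial services and
    healthcare shift all thresholds up by 0.5."""
    _expected, good, alarm = BENCHMARKS[segment]
    shift = 0.5 if regulated else 0.0
    if score < alarm + shift:
        return "alarm"
    if score >= good + shift:
        return "good"
    return "typical"

print(assess(2.62, "growth"))                  # typical
print(assess(2.30, "growth", regulated=True))  # alarm: below the shifted 2.5 floor
```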
Fetch when a user asks to evaluate data maturity, diagnose why data or analytics initiatives are failing, prepare a data strategy roadmap, justify data infrastructure investment, prepare for AI/ML adoption, or conduct due diligence on data capabilities.