Database Sharding Strategies: A Complete System Design Guide

Type: Software Reference | Confidence: 0.93 | Sources: 7 | Verified: 2026-02-23 | Freshness: quarterly

TL;DR

Shard only after vertical scaling and read replicas are exhausted. Choose the shard key from your dominant query pattern (high cardinality, present in most WHERE clauses), default to hash-based sharding on an entity ID, use range-based sharding for time-series data, and prefer proven middleware (Citus, Vitess) over homegrown routing.

Constraints

Quick Reference

Strategy | How It Works | Distribution | Range Queries | Resharding Cost | Best For | Worst For
Hash-based | shard = hash(key) % N | Uniform (even) | Poor (scatter-gather) | High (rehash all data) | High-write workloads, random access by ID | Time-series, range scans
Consistent hashing | Hash ring with virtual nodes | Uniform | Poor | Low (only ~K/N keys move) | Elastic scaling, frequent shard add/remove | Range queries, ordered access
Range-based | shard = key_range(value) | Depends on data | Excellent (single shard) | Medium (split/merge ranges) | Time-series, ordered data, date partitions | Monotonic keys (hot last shard)
Geo-based | shard = region(location) | Uneven (population) | Good within region | Low (region reassignment) | Multi-region apps, data residency compliance | Global queries, small datasets
Directory-based | Lookup table maps key to shard | Configurable | Depends on mapping | Low (update directory) | Complex routing, tenant isolation | High throughput (directory = bottleneck)
Hash + range (compound) | Hash on tenant, range on time | Balanced | Good within tenant | Medium | Multi-tenant SaaS with time-series | Simple applications
Schema-based | Each tenant gets own schema | Isolated | Full SQL within tenant | Low (move schema) | Strong tenant isolation, compliance | Large tenant counts (>10K schemas)
Algorithmic (jump hash) | jump_consistent_hash(key, N) | Near-perfect | Poor | Very low (minimal movement) | Caching layers, stateless routing | Removing arbitrary shards (can only shrink from the end)
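The algorithmic row's jump_consistent_hash can be sketched in a few lines of Python (Lamport and Veach's jump consistent hash; the constant and routing loop below follow the published pseudocode):

```python
def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit key to a bucket in [0, num_buckets), moving a
    minimal number of keys when num_buckets changes."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step (constant from the algorithm)
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * (1 << 31) / ((key >> 33) + 1))
    return b
```

Growing from N to N+1 buckets only moves keys into the new bucket; no key ever shuffles between existing buckets, which is why its resharding cost is listed as "very low".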

Decision Tree

START -- What is your primary access pattern?
|
+-- Single-entity lookup (user_id, order_id, tenant_id)?
|   +-- Need elastic scaling (frequently add/remove shards)?
|   |   +-- YES --> Consistent hashing (virtual nodes)
|   |   +-- NO  --> Hash-based sharding (simple modulo or jump hash)
|   |
+-- Range scans (time-series, date ranges, sequential IDs)?
|   +-- Data is append-only (logs, events, IoT)?
|   |   +-- YES --> Range-based with time buckets (e.g., monthly shards)
|   |   +-- NO  --> Range-based with dynamic range splitting
|   |
+-- Geographic queries (user location, data residency)?
|   +-- YES --> Geo-based sharding (shard per region/country)
|   |
+-- Multi-tenant SaaS with mixed access patterns?
|   +-- <1K tenants with strong isolation needs?
|   |   +-- YES --> Schema-based sharding (Citus schema-based)
|   +-- >1K tenants?
|       +-- YES --> Hash on tenant_id (co-locate all tenant data)
|
+-- Unknown or complex routing needs?
    +-- YES --> Directory-based (lookup table with flexibility)
    +-- DEFAULT --> Start with hash-based on most-queried entity ID

Step-by-Step Guide

1. Identify your shard key candidates

Analyze your query patterns to find the column that appears in the WHERE clause of 80%+ of your queries. The shard key must have high cardinality (thousands+ distinct values) and appear in most queries to avoid scatter-gather operations. [src7]

-- PostgreSQL: Find most common WHERE clause columns
SELECT query, calls, mean_exec_time
FROM pg_stat_statements
ORDER BY calls DESC LIMIT 20;

Verify: The top 5 queries all include your candidate shard key in their WHERE clause.
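The verification can be automated with a crude textual scan of the statements returned above. This is a hypothetical helper (a real SQL parser would be more robust than a regex):

```python
import re

def queries_missing_key(queries, shard_key):
    """Return statements whose WHERE clause never mentions shard_key.
    Crude regex check -- not a SQL parser."""
    pat = re.compile(rf"\bWHERE\b.*?\b{re.escape(shard_key)}\b",
                     re.IGNORECASE | re.DOTALL)
    return [q for q in queries if not pat.search(q)]
```

Feed it the top-20 query texts from pg_stat_statements; an empty result means the candidate key appears in every hot query.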

2. Evaluate shard key quality

A good shard key has high cardinality, high frequency in queries, and low monotonicity. Test distribution before committing. [src3]

-- Check cardinality: should be >10x your planned shard count
SELECT COUNT(DISTINCT tenant_id) AS cardinality FROM orders;

-- Check distribution: no single value should exceed 5% of total
SELECT tenant_id, COUNT(*) AS row_count,
       ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER(), 2) AS pct
FROM orders GROUP BY tenant_id ORDER BY row_count DESC LIMIT 20;

Verify: cardinality > 10 * planned_shards and max(pct) < 5%.
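Both thresholds can be wrapped in a small helper that takes the per-value counts from the distribution query above (shard_key_ok is a hypothetical name; the 10x and 5% defaults mirror the verify step):

```python
def shard_key_ok(value_counts, planned_shards, min_card_factor=10, max_pct=5.0):
    """value_counts: {shard_key_value: row_count}. True when cardinality
    exceeds min_card_factor * planned_shards and no single value holds
    more than max_pct percent of all rows."""
    total = sum(value_counts.values())
    if total == 0:
        return False
    top_pct = max(value_counts.values()) * 100.0 / total
    return (len(value_counts) > min_card_factor * planned_shards
            and top_pct < max_pct)
```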

3. Choose your sharding strategy

Based on the decision tree, select hash, range, geo, or directory-based sharding. For most OLTP applications, hash-based on tenant_id or user_id is the safest default. [src6]

Verify: Your chosen strategy handles your top 5 queries without cross-shard operations.

4. Implement sharding with your chosen technology

Deploy using a proven sharding middleware or native distributed database rather than building custom sharding logic. [src1]

-- PostgreSQL + Citus (hash distribution)
SELECT create_distributed_table('orders', 'tenant_id');
SELECT create_distributed_table('order_items', 'tenant_id',
       colocate_with => 'orders');

Verify: SELECT nodename, count(*) FROM citus_shards GROUP BY nodename; shows shards placed on every worker node.

5. Set up monitoring and rebalancing

Monitor shard distribution, query routing, and hotspots. Set up alerts for shard imbalance exceeding 20%. [src5]

-- Citus: Check shard sizes and distribution
SELECT nodename, COUNT(*) AS shard_count,
       pg_size_pretty(SUM(shard_size)) AS total_size
FROM citus_shards GROUP BY nodename;

Verify: Shard sizes are within 20% of each other. No single shard handles >30% of total queries.
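The 20% alert threshold from this step can be computed from the per-node sizes returned by the query above (a sketch; the function name and threshold wiring are illustrative):

```python
def shard_imbalance(sizes):
    """Relative spread between the largest and smallest shard
    (0.0 = perfectly balanced). Alert when this exceeds 0.20."""
    if not sizes or max(sizes) == 0:
        return 0.0
    return (max(sizes) - min(sizes)) / max(sizes)
```

Run it per scrape interval against node-level totals and page when shard_imbalance(sizes) > 0.20.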

Code Examples

Python: Consistent Hashing Ring Implementation

# Input:  List of shard nodes + data keys to route
# Output: Shard assignment for each key with minimal disruption on node changes
import hashlib
from bisect import bisect_right

class ConsistentHashRing:
    def __init__(self, nodes=None, virtual_nodes=150):
        self.ring, self.sorted_keys = {}, []
        self.virtual_nodes = virtual_nodes
        for node in (nodes or []):
            self.add_node(node)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.virtual_nodes):
            h = self._hash(f"{node}:vn{i}")
            self.ring[h] = node
            self.sorted_keys.append(h)
        self.sorted_keys.sort()

    def get_node(self, key):
        if not self.sorted_keys:
            return None
        # First virtual node clockwise from the key's hash; wraps to index 0.
        idx = bisect_right(self.sorted_keys, self._hash(key)) % len(self.sorted_keys)
        return self.ring[self.sorted_keys[idx]]

Full script: python-consistent-hashing-ring-implementation.py (42 lines)

SQL/Vitess: VSchema Configuration for Hash-Based Sharding

// Vitess vschema -- defines sharding strategy for MySQL
{
  "sharded": true,
  "vindexes": { "hash_tenant": { "type": "hash" } },
  "tables": {
    "orders": {
      "column_vindexes": [{ "column": "tenant_id", "name": "hash_tenant" }]
    }
  }
}

Full config: vschema-configuration.json (31 lines)

Go: Application-Level Shard Router

// Input:  Shard key (tenant_id) + database connections
// Output: Routes queries to correct shard based on hash
func (r *ShardRouter) GetShard(tenantID string) *sql.DB {
    h := fnv.New32a()
    h.Write([]byte(tenantID))
    idx := h.Sum32() % r.count
    return r.shards[idx]
}

Full script: go-application-level-shard-router.go (34 lines)

Anti-Patterns

Wrong: Sharding by auto-increment ID with modulo

-- BAD -- Modulo on auto-increment creates hot shard for recent data.
-- All new inserts go to shard = (max_id + 1) % N, hammering one shard.
INSERT INTO orders_shard_%s VALUES (...);  -- shard = order_id % 4

Correct: Hash the shard key with consistent hashing

-- GOOD -- Hash distributes evenly regardless of key ordering.
-- Adding a 5th shard moves only ~20% of data (1/N).
SELECT create_distributed_table('orders', 'tenant_id');  -- Citus handles hashing
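The resharding-cost gap can be measured empirically. This sketch counts how many keys change shards when growing from 4 to 5 shards under plain modulo (roughly 4/5 of keys relocate) versus the ~1/N floor that consistent schemes approach:

```python
import hashlib

def stable_hash(key):
    # Deterministic across runs, unlike Python's builtin hash()
    return int(hashlib.md5(str(key).encode()).hexdigest(), 16)

keys = [stable_hash(i) for i in range(50_000)]

# Fraction of keys whose shard changes when resharding 4 -> 5 via modulo
moved = sum(1 for k in keys if k % 4 != k % 5) / len(keys)
# Expect ~0.8: modulo relocates almost everything, while consistent
# hashing or jump hash would move only ~0.2 (1/N) of the keys.
```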

Wrong: Using a low-cardinality column as shard key

# BAD -- country_code has ~200 values. With 16 shards, some shards
# get USA (40% of data) while others get Liechtenstein (0.001%).
shard = hash(row["country_code"]) % 16  # massive skew

Correct: Use a high-cardinality key, add geo-routing at application level

# GOOD -- Shard by tenant_id (millions of values = even distribution).
# Handle geo-routing in the application layer, not the shard key.
shard = consistent_hash(row["tenant_id"])
cluster = geo_directory[row["region"]]

Wrong: Cross-shard JOINs in hot-path queries

-- BAD -- This JOIN hits every shard, aggregates at coordinator.
-- 16 shards = 16 network round trips + merge. Latency: 50-500ms.
SELECT o.*, u.name FROM orders o
JOIN users u ON o.user_id = u.user_id
WHERE o.created_at > NOW() - INTERVAL '1 hour';

Correct: Co-locate related tables on the same shard key

-- GOOD -- Both tables sharded by tenant_id = JOIN stays on one shard.
SELECT create_distributed_table('orders', 'tenant_id');
SELECT create_distributed_table('users', 'tenant_id',
       colocate_with => 'orders');
-- Query stays on one shard:
SELECT o.*, u.name FROM orders o
JOIN users u ON o.tenant_id = u.tenant_id AND o.user_id = u.user_id
WHERE o.tenant_id = 'abc123';

Wrong: Building custom sharding logic in application code

# BAD -- Homegrown shard routing = bugs, no rebalancing, no failover.
def get_connection(user_id):
    shard_id = hash(user_id) % len(shards)
    return psycopg2.connect(shards[shard_id])
# What about transactions? Failover? Schema migrations?

Correct: Use a proven sharding middleware

# GOOD -- Let Citus, Vitess, or ProxySQL handle routing, rebalancing,
# failover, and schema migrations. You write normal SQL.
conn = psycopg2.connect("host=citus-coordinator dbname=myapp")
cur = conn.cursor()
cur.execute("SELECT * FROM orders WHERE tenant_id = %s", ("abc123",))

Common Pitfalls

Diagnostic Commands

# PostgreSQL + Citus: Check shard distribution
psql -c "SELECT nodename, COUNT(*) AS shards, pg_size_pretty(SUM(shard_size))
         FROM citus_shards GROUP BY nodename;"

# Citus: Find scatter-gather queries -- router-executed queries record a
# partition_key; multi-shard queries leave it NULL
psql -c "SELECT query, executor, calls
         FROM citus_stat_statements
         WHERE partition_key IS NULL ORDER BY calls DESC LIMIT 10;"

# MongoDB: Check balancer status and chunk distribution
mongosh --eval 'sh.status()'
mongosh --eval 'db.adminCommand({ balancerStatus: 1 })'

# Vitess: List all tablets (cell, keyspace, shard, type, hostname)
vtctlclient ListAllTablets

# General: Check shard key cardinality
psql -c "SELECT COUNT(DISTINCT tenant_id) AS cardinality,
                COUNT(*) AS total_rows FROM orders;"

Version History & Compatibility

Technology | Version | Sharding Features | Notes
PostgreSQL + Citus | 13.0 (2025) | Schema-based sharding, shard rebalancer, CDC | Citus 12 introduced schema-based sharding (2023)
Vitess | 21.x (2025) | VReplication, online resharding, VDiff | Used by Slack, GitHub, PlanetScale. CNCF graduated
MongoDB | 8.0 (2024) | Hashed + ranged sharding, zone sharding | Online resharding added in 6.0 (2022)
CockroachDB | 24.x (2025) | Automatic range splitting, hash-sharded indexes | Transparent sharding -- no manual shard management
TiDB | 8.x (2025) | Auto-sharding, TiFlash columnar, placement rules | MySQL-compatible, auto-splits at 96MB regions
YugabyteDB | 2024.x | Hash + range sharding, tablespace geo-partitioning | PostgreSQL-compatible, automatic tablet splitting
Cassandra | 5.0 (2024) | Consistent hashing (Murmur3), virtual nodes | Always sharded -- no single-node mode
ScyllaDB | 6.x (2025) | Tablets (replacing vnodes), Raft-based topology | Cassandra-compatible, 10x lower latency

When to Use / When Not to Use

Use When | Don't Use When | Use Instead
Single database exceeds 1-5TB and vertical scaling is maxed | Database is <500GB with room to scale vertically | Add more RAM/CPU, optimize queries, add read replicas
Write throughput exceeds single-node capacity (>50K writes/sec) | Read-heavy workload with low write volume | Read replicas with connection routing
Data residency requirements mandate geographic isolation | All users are in one region | Single-region deployment with edge caching
Multi-tenant SaaS needs tenant-level isolation and scaling | <100 tenants with similar data volumes | Schema-per-tenant or row-level security on single DB
Need to scale horizontally beyond single-machine limits | Latency stems from unoptimized queries and indexes, not data volume | Index optimization and query tuning first
Compliance requires data in specific jurisdictions (GDPR, CCPA) | No regulatory data residency requirements | Single deployment with encryption at rest

Important Caveats

Related Units