| Approach | Keys Moved on Node Change | Load Balance | Lookup Time | Memory | Best Use Case |
|---|---|---|---|---|---|
| Modular hashing (key % n) | ~100% (full rehash) | Perfect (uniform) | O(1) | O(1) | Static cluster, never changes |
| Ring hashing (no vnodes) | K/n average | Poor (O(log n) imbalance) | O(log n) | O(n) | Prototype only |
| Ring hashing + virtual nodes | K/n average | Good (tunable via vnode count) | O(log n) | O(n * v) | Cache clusters, DB sharding |
| Jump hash (Google, 2014) | K/n optimal | Near-perfect | O(ln n) | O(1) | Static shard count, append-only |
| Rendezvous / HRW hashing | K/n optimal | Good | O(n) per lookup | O(1) | Small clusters, k-replication |
| Maglev hashing (Google) | K/n near-optimal | Near-perfect | O(1) lookup table | O(n * M) | L4 load balancers |
| CRUSH (Ceph) | K/n optimal | Configurable (weighted) | O(log n) | O(n) | Storage clusters with failure domains |
| Ketama (libmemcached) | K/n average | Good (150 vnodes default) | O(log n) | O(n * 150) | Memcached clusters |
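Jump hash, listed above, is compact enough to sketch directly. This is a straight Python port of the Lamping-Veach pseudocode (variable names are mine); it assumes the key is already a 64-bit integer, so hash string keys first:

```python
def jump_hash(key: int, num_buckets: int) -> int:
    # Lamping & Veach (2014): maps a 64-bit integer key to a bucket in
    # [0, num_buckets) using O(1) memory and ~ln(n) iterations. When
    # num_buckets grows by one, only ~K/n keys move, all to the new bucket.
    b, j = -1, 0
    while j < num_buckets:
        b = j
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF  # 64-bit LCG step
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b
```

The catch, per the table: buckets are numbered 0..n-1 with no names attached, so you can only ever remove the highest-numbered bucket, which is why it fits append-only shard counts.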
START: What is your distributed hashing use case?
|
+-- Is the node set static (rarely add/remove nodes)?
| +-- YES: Is the bucket count append-only?
| | +-- YES --> Use Jump Hash (perfect balance, zero memory overhead)
| | +-- NO --> Use modular hashing (simplest, perfect balance)
| +-- NO (dynamic node membership) ↓
|
+-- How many nodes in the cluster?
| +-- < 3 nodes --> Use modular hashing (overhead not justified)
| +-- 3-20 nodes ↓
| | +-- Need k-way replication with no ring metadata?
| | | +-- YES --> Use Rendezvous (HRW) hashing
| | | +-- NO --> Use Ring Hash + 100-200 virtual nodes per node
| +-- 20+ nodes ↓
| +-- L4 load balancing?
| | +-- YES --> Use Maglev hashing (O(1) lookup)
| | +-- NO --> Use Ring Hash + virtual nodes
|
+-- Storage cluster with failure domains (rack/DC awareness)?
+-- YES --> Use CRUSH algorithm (Ceph-style)
+-- NO --> Use Ring Hash + virtual nodes (default recommendation)
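Rendezvous (HRW) hashing, recommended in the tree above for small clusters that need k-replication, can be sketched in a few lines (the MD5-based score and the `node:key` separator are illustrative choices, not a fixed standard):

```python
import hashlib

def _score(node: str, key: str) -> int:
    # Combined hash of (node, key); the node with the highest score owns the key.
    return int(hashlib.md5(f"{node}:{key}".encode()).hexdigest(), 16)

def hrw_node(key: str, nodes: list[str]) -> str:
    return max(nodes, key=lambda n: _score(n, key))

def hrw_nodes(key: str, nodes: list[str], k: int) -> list[str]:
    # The top-k scoring nodes give k-way replication with no ring metadata.
    return sorted(nodes, key=lambda n: _score(n, key), reverse=True)[:k]
```

Removing a node remaps only the keys it owned; every other key keeps its highest-scoring node. The trade-off is O(n) hash computations per lookup, which is why the tree caps it at small clusters.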
Map both nodes and keys onto a circular hash space [0, 2^32 - 1]. Use a uniform hash function like MD5, SHA-1, or xxHash. [src1]
import hashlib

RING_SIZE = 2**32  # Standard 32-bit hash space

def hash_key(key: str) -> int:
    digest = hashlib.md5(key.encode('utf-8')).hexdigest()
    return int(digest, 16) % RING_SIZE
Verify: hash_key("my-key-1") → returns a deterministic integer in range [0, 4294967295].
Hash each node identifier to get its position on the ring. [src2]
nodes = ["cache-01", "cache-02", "cache-03"]
node_positions = {hash_key(node): node for node in nodes}
sorted_positions = sorted(node_positions.keys())
Verify: Each node has a unique position; len(sorted_positions) == len(nodes).
Create 100-200 hash positions per physical node by appending a suffix. [src5]
VIRTUAL_NODES = 150
ring = {}
for node in nodes:
    for i in range(VIRTUAL_NODES):
        pos = hash_key(f"{node}#vn{i}")
        ring[pos] = node
sorted_keys = sorted(ring.keys())
Verify: len(sorted_keys) == len(nodes) * VIRTUAL_NODES (e.g., 3 * 150 = 450).
Find the first node position clockwise using binary search for O(log n) lookup. [src2]
import bisect

def get_node(key: str) -> str:
    pos = hash_key(key)
    idx = bisect.bisect_right(sorted_keys, pos)
    if idx == len(sorted_keys):
        idx = 0  # Wrap around the ring
    return ring[sorted_keys[idx]]
Verify: get_node("user:42") → returns same node name consistently across calls.
Add virtual nodes to the ring. Only keys between the new node and its predecessor need reassignment. [src6]
def add_node(node: str):
    for i in range(VIRTUAL_NODES):
        pos = hash_key(f"{node}#vn{i}")
        ring[pos] = node
    sorted_keys.clear()
    sorted_keys.extend(sorted(ring.keys()))
Verify: After adding a 4th node, approximately 25% of keys remap (K/4).
Remove all virtual nodes belonging to the departing node. Keys are automatically reassigned to the next clockwise node. [src6]
def remove_node(node: str):
    positions_to_remove = [pos for pos, n in ring.items() if n == node]
    for pos in positions_to_remove:
        del ring[pos]
    sorted_keys.clear()
    sorted_keys.extend(sorted(ring.keys()))
Verify: After removing 1 node from a 4-node cluster, approximately 25% of keys remap.
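The remap fractions in the Verify steps above can be checked empirically. A self-contained simulation sketch (node and key names are illustrative):

```python
import bisect
import hashlib

def h(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % 2**32

def build_ring(nodes, vnodes=150):
    ring = {h(f"{n}#vn{i}"): n for n in nodes for i in range(vnodes)}
    return ring, sorted(ring)

def lookup(key, ring, keys):
    idx = bisect.bisect_right(keys, h(key)) % len(keys)
    return ring[keys[idx]]

keys = [f"user:{i}" for i in range(10_000)]
ring3, sk3 = build_ring(["s1", "s2", "s3"])
ring4, sk4 = build_ring(["s1", "s2", "s3", "s4"])

before = {k: lookup(k, ring3, sk3) for k in keys}
moved = [k for k in keys if lookup(k, ring4, sk4) != before[k]]
print(f"{len(moved) / len(keys):.1%} of keys moved")  # expect roughly K/4 ≈ 25%
```

Every moved key lands on the new node "s4": adding vnodes can only capture the arc immediately counter-clockwise of each new position, never shuffle keys between existing nodes.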
# Input: list of node names, number of virtual nodes
# Output: ConsistentHashRing with get_node(), add_node(), remove_node()
import hashlib
import bisect
from typing import Optional

class ConsistentHashRing:
    def __init__(self, nodes: Optional[list[str]] = None, vnodes: int = 150):
        self.vnodes = vnodes
        self._ring: dict[int, str] = {}
        self._sorted_keys: list[int] = []
        for node in (nodes or []):
            self.add_node(node)

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#vn{i}")
            self._ring[pos] = node
        self._sorted_keys = sorted(self._ring.keys())

    def remove_node(self, node: str) -> None:
        self._ring = {k: v for k, v in self._ring.items() if v != node}
        self._sorted_keys = sorted(self._ring.keys())

    def get_node(self, key: str) -> Optional[str]:
        if not self._ring:
            return None
        pos = self._hash(key)
        idx = bisect.bisect_right(self._sorted_keys, pos) % len(self._sorted_keys)
        return self._ring[self._sorted_keys[idx]]
// Input: list of node names, replicas count
// Output: Ring struct with Get(), Add(), Remove()
package chash

import (
	"crypto/md5"
	"encoding/binary"
	"fmt"
	"sort"
	"sync"
)

type Ring struct {
	mu       sync.RWMutex
	nodes    map[uint32]string
	keys     []uint32
	replicas int
}

func New(replicas int, nodes ...string) *Ring {
	r := &Ring{replicas: replicas, nodes: make(map[uint32]string)}
	for _, n := range nodes {
		r.Add(n)
	}
	return r
}

func (r *Ring) hash(key string) uint32 {
	h := md5.Sum([]byte(key))
	return binary.BigEndian.Uint32(h[:4])
}

func (r *Ring) Add(node string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	for i := 0; i < r.replicas; i++ {
		h := r.hash(fmt.Sprintf("%s#vn%d", node, i))
		r.nodes[h] = node
		r.keys = append(r.keys, h)
	}
	sort.Slice(r.keys, func(i, j int) bool { return r.keys[i] < r.keys[j] })
}

func (r *Ring) Remove(node string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	for i := 0; i < r.replicas; i++ {
		delete(r.nodes, r.hash(fmt.Sprintf("%s#vn%d", node, i)))
	}
	// Rebuild the sorted key slice from the surviving virtual nodes.
	r.keys = r.keys[:0]
	for h := range r.nodes {
		r.keys = append(r.keys, h)
	}
	sort.Slice(r.keys, func(i, j int) bool { return r.keys[i] < r.keys[j] })
}

func (r *Ring) Get(key string) string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	if len(r.keys) == 0 {
		return ""
	}
	h := r.hash(key)
	idx := sort.Search(len(r.keys), func(i int) bool { return r.keys[i] >= h })
	if idx == len(r.keys) {
		idx = 0
	}
	return r.nodes[r.keys[idx]]
}
// Input: collection of node names, virtual node count
// Output: ConsistentHashRing with getNode(), addNode(), removeNode()
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.*;
import java.util.concurrent.ConcurrentSkipListMap;

public class ConsistentHashRing {
    private final int vnodeCount;
    private final ConcurrentSkipListMap<Long, String> ring = new ConcurrentSkipListMap<>();

    public ConsistentHashRing(int vnodeCount) { this.vnodeCount = vnodeCount; }

    private long hash(String key) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] d = md.digest(key.getBytes(StandardCharsets.UTF_8));
            return ((long) (d[0] & 0xFF) << 24) | ((long) (d[1] & 0xFF) << 16)
                    | ((long) (d[2] & 0xFF) << 8) | (d[3] & 0xFF);
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    public void addNode(String node) {
        for (int i = 0; i < vnodeCount; i++)
            ring.put(hash(node + "#vn" + i), node);
    }

    public void removeNode(String node) {
        for (int i = 0; i < vnodeCount; i++)
            ring.remove(hash(node + "#vn" + i));
    }

    public String getNode(String key) {
        if (ring.isEmpty()) return null;
        long h = hash(key);
        Map.Entry<Long, String> entry = ring.ceilingEntry(h);
        return (entry != null) ? entry.getValue() : ring.firstEntry().getValue();
    }
}
# BAD -- adding/removing a server rehashes nearly every key
def get_server(key, servers):
    return servers[hash(key) % len(servers)]
# If servers grows from 3 to 4, ~75% of keys remap

# GOOD -- only ~K/n keys remap when a node joins or leaves
ring = ConsistentHashRing(nodes=["s1", "s2", "s3"], vnodes=150)
server = ring.get_node(key)
# Adding "s4" remaps only ~25% of keys
# BAD -- 3 physical nodes on a ring gives wildly uneven load
ring = {}
for node in nodes:
    ring[hash_key(node)] = node  # Only 1 position per node!
# One node may get 60% of keys, another 10%

# GOOD -- 150 virtual nodes per physical node ensures even distribution
ring = ConsistentHashRing(nodes=["s1", "s2", "s3"], vnodes=150)
# Standard deviation of load drops proportionally to 1/sqrt(vnodes). [src5]
# BAD -- fails when key hash > all node positions
def get_node_broken(key, sorted_keys, ring):
    pos = hash_key(key)
    for k in sorted_keys:
        if k >= pos:
            return ring[k]
    return None  # BUG: returns None instead of wrapping to first node

# GOOD -- wraps around the ring correctly
def get_node_correct(key, sorted_keys, ring):
    pos = hash_key(key)
    idx = bisect.bisect_right(sorted_keys, pos)
    if idx == len(sorted_keys):
        idx = 0  # Wrap to first node on the ring
    return ring[sorted_keys[idx]]
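The 1/sqrt(vnodes) balance effect can be measured directly. A self-contained sketch (node names and key counts are illustrative) that compares the most-loaded node against the ideal even share:

```python
import bisect
import hashlib
from collections import Counter

def h(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % 2**32

def load_spread(vnodes: int, nodes=("s1", "s2", "s3"), n_keys=30_000):
    # Build the ring, assign n_keys synthetic keys, and report
    # (load of the busiest node) / (ideal even load).
    ring = {h(f"{n}#vn{i}"): n for n in nodes for i in range(vnodes)}
    keys = sorted(ring)
    counts = Counter(
        ring[keys[bisect.bisect_right(keys, h(f"k{j}")) % len(keys)]]
        for j in range(n_keys)
    )
    return max(counts.values()) / (n_keys / len(nodes))

print(f"1 vnode/node:    {load_spread(1):.2f}x ideal")
print(f"150 vnodes/node: {load_spread(150):.2f}x ideal")  # typically close to 1.0
```

A ratio of 1.0 is perfect balance; with a single position per node the busiest node usually carries far more than its share, while 150 vnodes per node pulls the ratio within a few percent of ideal.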
Python's built-in hash() is randomized per process (PYTHONHASHSEED) and not portable across machines. Fix: Use hashlib.md5 or hashlib.sha1 for deterministic, cross-platform hashing. [src3]

| Use When | Don't Use When | Use Instead |
|---|---|---|
| Distributed cache (Memcached, Redis) with nodes joining/leaving | Static cluster that never changes size | Modular hashing (key % n) |
| Database sharding with horizontal scaling | Fewer than 3 nodes | Simple modular hashing |
| Load balancing across a dynamic server pool | Need range queries on keys | Range-based partitioning |
| CDN request routing with server churn | Append-only shard count growth | Jump hash (perfect balance, O(1) memory) |
| Need minimal key migration during scaling events | Need built-in k-replication with no ring metadata | Rendezvous (HRW) hashing |
| Peer-to-peer overlay networks (Chord, Dynamo) | Need rack/DC-aware failure domain placement | CRUSH algorithm (Ceph) |