Consistent Hashing: Hash Ring, Virtual Nodes, and Distributed Key Routing

Type: Software Reference | Confidence: 0.92 | Sources: 7 | Verified: 2026-02-24 | Freshness: 2026-02-24

TL;DR

Consistent hashing places both keys and nodes on a circular hash space so that adding or removing a node remaps only about K/n keys (K keys, n nodes) instead of nearly all of them. Each key is routed to the first node clockwise from its hash position; 100-200 virtual nodes per physical node keep the load even.

Constraints

Quick Reference

| Approach | Keys Moved on Node Change | Load Balance | Lookup Time | Memory | Best Use Case |
| --- | --- | --- | --- | --- | --- |
| Modular hashing (key % n) | ~100% (full rehash) | Perfect (uniform) | O(1) | O(1) | Static cluster, never changes |
| Ring hashing (no vnodes) | K/n average | Poor (O(log n) imbalance) | O(log n) | O(n) | Prototype only |
| Ring hashing + virtual nodes | K/n average | Good (tunable via vnode count) | O(log n) | O(n * v) | Cache clusters, DB sharding |
| Jump hash (Google, 2014) | K/n optimal | Near-perfect (0.0000008% std dev) | O(ln n) | O(1) | Static shard count, append-only |
| Rendezvous / HRW hashing | K/n optimal | Good | O(n) per lookup | O(1) | Small clusters, k-replication |
| Maglev hashing (Google) | K/n near-optimal | Near-perfect | O(1) lookup table | O(n * M) | L4 load balancers |
| CRUSH (Ceph) | K/n optimal | Configurable (weighted) | O(log n) | O(n) | Storage clusters with failure domains |
| Ketama (libmemcached) | K/n average | Good (150 vnodes default) | O(log n) | O(n * 150) | Memcached clusters |

Decision Tree

START: What is your distributed hashing use case?
|
+-- Is the node set static (rarely add/remove nodes)?
|   +-- YES: Is the bucket count append-only?
|   |   +-- YES --> Use Jump Hash (perfect balance, zero memory overhead)
|   |   +-- NO --> Use modular hashing (simplest, perfect balance)
|   +-- NO (dynamic node membership) ↓
|
+-- How many nodes in the cluster?
|   +-- < 3 nodes --> Use modular hashing (overhead not justified)
|   +-- 3-20 nodes ↓
|   |   +-- Need k-way replication with no ring metadata?
|   |   |   +-- YES --> Use Rendezvous (HRW) hashing
|   |   |   +-- NO --> Use Ring Hash + 100-200 virtual nodes per node
|   +-- 20+ nodes ↓
|       +-- L4 load balancing?
|       |   +-- YES --> Use Maglev hashing (O(1) lookup)
|       |   +-- NO --> Use Ring Hash + virtual nodes
|
+-- Storage cluster with failure domains (rack/DC awareness)?
    +-- YES --> Use CRUSH algorithm (Ceph-style)
    +-- NO --> Use Ring Hash + virtual nodes (default recommendation)
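Jump hash, recommended above for append-only bucket counts, needs no ring state at all. A minimal Python port of the published Lamping-Veach algorithm (the 64-bit multiplier constant comes from the paper):

```python
def jump_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit integer key to a bucket in [0, num_buckets) with minimal remapping."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # Linear congruential step from the paper, reduced mod 2^64.
        key = (key * 2862933555777941757 + 1) % (1 << 64)
        # Jump ahead; expected iteration count is O(ln num_buckets).
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b
```

Growing from n to n+1 buckets moves a key only if it lands in the new bucket, which is why the quick reference lists K/n optimal movement; the trade-off is that buckets can only be added or removed at the end of the range.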

Step-by-Step Guide

1. Define the hash ring space

Map both nodes and keys onto a circular hash space [0, 2^32 - 1]. Use a uniformly distributing hash function such as MD5, SHA-1, or xxHash; cryptographic strength is not required here, only an even spread. [src1]

import hashlib

RING_SIZE = 2**32  # Standard 32-bit hash space

def hash_key(key: str) -> int:
    digest = hashlib.md5(key.encode('utf-8')).hexdigest()
    return int(digest, 16) % RING_SIZE

Verify: hash_key("my-key-1") → returns a deterministic integer in range [0, 4294967295].

2. Place physical nodes on the ring

Hash each node identifier to get its position on the ring. [src2]

nodes = ["cache-01", "cache-02", "cache-03"]
node_positions = {hash_key(node): node for node in nodes}
sorted_positions = sorted(node_positions.keys())

Verify: Each node has a unique position; len(sorted_positions) == len(nodes).

3. Add virtual nodes for load balance

Create 100-200 hash positions per physical node by appending a suffix. [src5]

VIRTUAL_NODES = 150

ring = {}
for node in nodes:
    for i in range(VIRTUAL_NODES):
        pos = hash_key(f"{node}#vn{i}")
        ring[pos] = node
sorted_keys = sorted(ring.keys())

Verify: len(sorted_keys) == len(nodes) * VIRTUAL_NODES (e.g., 3 * 150 = 450).

4. Route keys to nodes via clockwise lookup

Find the first node position clockwise using binary search for O(log n) lookup. [src2]

import bisect

def get_node(key: str) -> str:
    pos = hash_key(key)
    idx = bisect.bisect_right(sorted_keys, pos)
    if idx == len(sorted_keys):
        idx = 0  # Wrap around the ring
    return ring[sorted_keys[idx]]

Verify: get_node("user:42") → returns same node name consistently across calls.

5. Handle node addition

Add the new node's virtual nodes to the ring. Only keys falling between each new virtual node and its ring predecessor are reassigned. [src6]

def add_node(node: str):
    for i in range(VIRTUAL_NODES):
        pos = hash_key(f"{node}#vn{i}")
        ring[pos] = node
    sorted_keys.clear()
    sorted_keys.extend(sorted(ring.keys()))

Verify: After adding a 4th node, approximately 25% of keys remap (K/4).

6. Handle node removal

Remove all virtual nodes belonging to the departing node. Keys are automatically reassigned to the next clockwise node. [src6]

def remove_node(node: str):
    positions_to_remove = [pos for pos, n in ring.items() if n == node]
    for pos in positions_to_remove:
        del ring[pos]
    sorted_keys.clear()
    sorted_keys.extend(sorted(ring.keys()))

Verify: After removing 1 node from a 4-node cluster, approximately 25% of keys remap.
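The Verify claims in steps 5 and 6 can be checked empirically. A self-contained sketch (node names and key format are illustrative) that rebuilds the ring from steps 3-4 and measures the remap fraction when a fourth node joins:

```python
import hashlib
import bisect

RING_SIZE = 2**32
VIRTUAL_NODES = 150

def hash_key(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % RING_SIZE

def build_ring(nodes):
    # One entry per virtual node; returns the ring and its sorted positions.
    ring = {}
    for node in nodes:
        for i in range(VIRTUAL_NODES):
            ring[hash_key(f"{node}#vn{i}")] = node
    return ring, sorted(ring)

def lookup(key, ring, positions):
    idx = bisect.bisect_right(positions, hash_key(key)) % len(positions)
    return ring[positions[idx]]

sample = [f"user:{i}" for i in range(10_000)]
ring3, sk3 = build_ring(["cache-01", "cache-02", "cache-03"])
ring4, sk4 = build_ring(["cache-01", "cache-02", "cache-03", "cache-04"])
before = {k: lookup(k, ring3, sk3) for k in sample}
after = {k: lookup(k, ring4, sk4) for k in sample}
moved = sum(1 for k in sample if before[k] != after[k])
print(f"{moved / len(sample):.1%} of keys moved")  # roughly 25% expected
```

Every moved key should land on the new node; keys never shuffle between the surviving nodes.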

Code Examples

Python: Complete Hash Ring with Virtual Nodes

# Input:  list of node names, number of virtual nodes
# Output: ConsistentHashRing with get_node(), add_node(), remove_node()

import hashlib
import bisect
from typing import Optional

class ConsistentHashRing:
    def __init__(self, nodes: Optional[list[str]] = None, vnodes: int = 150):
        self.vnodes = vnodes
        self._ring: dict[int, str] = {}
        self._sorted_keys: list[int] = []
        for node in (nodes or []):
            self.add_node(node)

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#vn{i}")
            self._ring[pos] = node
        self._sorted_keys = sorted(self._ring.keys())

    def remove_node(self, node: str) -> None:
        self._ring = {k: v for k, v in self._ring.items() if v != node}
        self._sorted_keys = sorted(self._ring.keys())

    def get_node(self, key: str) -> Optional[str]:
        if not self._ring:
            return None
        pos = self._hash(key)
        idx = bisect.bisect_right(self._sorted_keys, pos) % len(self._sorted_keys)
        return self._ring[self._sorted_keys[idx]]

Go: Hash Ring with Virtual Nodes

// Input:  list of node names, replicas count
// Output: Ring struct with Get(), Add(), Remove()

package chash

import (
    "crypto/md5"
    "encoding/binary"
    "fmt"
    "sort"
    "sync"
)

type Ring struct {
    mu       sync.RWMutex
    nodes    map[uint32]string
    keys     []uint32
    replicas int
}

func New(replicas int, nodes ...string) *Ring {
    r := &Ring{replicas: replicas, nodes: make(map[uint32]string)}
    for _, n := range nodes {
        r.Add(n)
    }
    return r
}

func (r *Ring) hash(key string) uint32 {
    h := md5.Sum([]byte(key))
    return binary.BigEndian.Uint32(h[:4])
}

func (r *Ring) Add(node string) {
    r.mu.Lock()
    defer r.mu.Unlock()
    for i := 0; i < r.replicas; i++ {
        h := r.hash(fmt.Sprintf("%s#vn%d", node, i))
        r.nodes[h] = node
        r.keys = append(r.keys, h)
    }
    sort.Slice(r.keys, func(i, j int) bool { return r.keys[i] < r.keys[j] })
}

func (r *Ring) Get(key string) string {
    r.mu.RLock()
    defer r.mu.RUnlock()
    if len(r.keys) == 0 {
        return ""
    }
    h := r.hash(key)
    idx := sort.Search(len(r.keys), func(i int) bool { return r.keys[i] >= h })
    if idx == len(r.keys) {
        idx = 0
    }
    return r.nodes[r.keys[idx]]
}

Java: Hash Ring with Virtual Nodes

// Input:  collection of node names, virtual node count
// Output: ConsistentHashRing with getNode(), addNode(), removeNode()

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.*;
import java.util.concurrent.ConcurrentSkipListMap;

public class ConsistentHashRing {
    private final int vnodeCount;
    private final ConcurrentSkipListMap<Long, String> ring = new ConcurrentSkipListMap<>();

    public ConsistentHashRing(int vnodeCount) { this.vnodeCount = vnodeCount; }

    private long hash(String key) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] d = md.digest(key.getBytes(StandardCharsets.UTF_8));
            return ((long)(d[0]&0xFF) << 24) | ((long)(d[1]&0xFF) << 16)
                 | ((long)(d[2]&0xFF) << 8) | (d[3]&0xFF);
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    public void addNode(String node) {
        for (int i = 0; i < vnodeCount; i++)
            ring.put(hash(node + "#vn" + i), node);
    }

    public void removeNode(String node) {
        for (int i = 0; i < vnodeCount; i++)
            ring.remove(hash(node + "#vn" + i));
    }

    public String getNode(String key) {
        if (ring.isEmpty()) return null;
        long h = hash(key);
        Map.Entry<Long, String> entry = ring.ceilingEntry(h);
        return (entry != null) ? entry.getValue() : ring.firstEntry().getValue();
    }
}

Anti-Patterns

Wrong: Using modular hashing for dynamic clusters

# BAD -- adding/removing a server rehashes nearly every key
def get_server(key, servers):
    return servers[hash(key) % len(servers)]
# If servers changes from 3 to 4, ~75% of keys remap
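The ~75% figure follows from the fact that a key stays put only when hash % 3 == hash % 4, which holds for 3 residues out of every 12. A quick check:

```python
# Fraction of hash values that change server when the pool grows from 3 to 4.
moved = sum(1 for h in range(100_000) if h % 3 != h % 4)
print(moved / 100_000)  # ~0.75
```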

Correct: Use consistent hashing for dynamic clusters

# GOOD -- only K/n keys remap when a node joins or leaves
ring = ConsistentHashRing(nodes=["s1", "s2", "s3"], vnodes=150)
server = ring.get_node(key)
# Adding "s4" remaps only ~25% of keys

Wrong: Consistent hashing without virtual nodes

# BAD -- 3 physical nodes on a ring gives wildly uneven load
ring = {}
for node in nodes:
    ring[hash_key(node)] = node  # Only 1 position per node!
# One node may get 60% of keys, another 10%

Correct: Always use 100-200 virtual nodes per physical node

# GOOD -- 150 virtual nodes per physical node ensures even distribution
ring = ConsistentHashRing(nodes=["s1", "s2", "s3"], vnodes=150)
# Standard deviation of load drops proportionally to 1/sqrt(vnodes)
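The 1/sqrt(vnodes) effect can be observed directly. A sketch (node and key names are illustrative) comparing the max/min load ratio across nodes with 1 versus 150 virtual nodes:

```python
import hashlib
import bisect
from collections import Counter

def hash_key(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % 2**32

def spread(nodes, vnodes, n_keys=10_000):
    # Build a ring with the given vnode count, assign n_keys sample keys,
    # and return the ratio of the most-loaded to least-loaded node.
    ring = {}
    for node in nodes:
        for i in range(vnodes):
            ring[hash_key(f"{node}#vn{i}")] = node
    positions = sorted(ring)
    counts = Counter()
    for k in range(n_keys):
        idx = bisect.bisect_right(positions, hash_key(f"user:{k}")) % len(positions)
        counts[ring[positions[idx]]] += 1
    return max(counts.values()) / min(counts.values())

nodes = ["s1", "s2", "s3"]
print(spread(nodes, 1))    # large imbalance is typical with one position per node
print(spread(nodes, 150))  # close to 1.0 with 150 virtual nodes
```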

Wrong: Not handling the ring wrap-around

# BAD -- fails when key hash > all node positions
def get_node_broken(key, sorted_keys, ring):
    pos = hash_key(key)
    for k in sorted_keys:
        if k >= pos:
            return ring[k]
    return None  # BUG: returns None instead of wrapping to first node

Correct: Wrap around to the first node

# GOOD -- wraps around the ring correctly
def get_node_correct(key, sorted_keys, ring):
    pos = hash_key(key)
    idx = bisect.bisect_right(sorted_keys, pos)
    if idx == len(sorted_keys):
        idx = 0  # Wrap to first node on the ring
    return ring[sorted_keys[idx]]

Common Pitfalls

When to Use / When Not to Use

| Use When | Don't Use When | Use Instead |
| --- | --- | --- |
| Distributed cache (Memcached, Redis) with nodes joining/leaving | Static cluster that never changes size | Modular hashing (key % n) |
| Database sharding with horizontal scaling | Fewer than 3 nodes | Simple modular hashing |
| Load balancing across a dynamic server pool | Need range queries on keys | Range-based partitioning |
| CDN request routing with server churn | Append-only shard count growth | Jump hash (perfect balance, O(1) memory) |
| Need minimal key migration during scaling events | Need built-in k-replication with no ring metadata | Rendezvous (HRW) hashing |
| Peer-to-peer overlay networks (Chord, Dynamo) | Need rack/DC-aware failure domain placement | CRUSH algorithm (Ceph) |
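Rendezvous (HRW) hashing, listed above as the alternative when built-in k-replication matters, keeps no ring state: each key scores every node and picks the top scorer(s). A minimal sketch (function names are illustrative):

```python
import hashlib

def _score(node: str, key: str) -> int:
    # Per-(node, key) score; any uniform hash works.
    return int(hashlib.md5(f"{node}:{key}".encode()).hexdigest(), 16)

def hrw_nodes(key: str, nodes: list[str], k: int = 1) -> list[str]:
    # Top-k scoring nodes give k-way replication with no extra metadata.
    return sorted(nodes, key=lambda n: _score(n, key), reverse=True)[:k]

hrw_nodes("user:42", ["s1", "s2", "s3"])        # one-element primary list
hrw_nodes("user:42", ["s1", "s2", "s3"], k=2)   # two replicas
```

Because a node's score for a key is independent of cluster membership, removing a node remaps only the keys it owned; the cost is an O(n) scan per lookup, which is why the quick reference recommends HRW for small clusters.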

Important Caveats

Related Units