🔁 System Design Series: Design Consistent Hashing


🚀 Introduction
In distributed systems, we often need to distribute data across multiple machines or servers. But what happens when a new server is added or removed?
A naive hash function would require almost complete data reallocation, which is extremely inefficient at scale.
This is where consistent hashing comes in — a beautifully elegant technique used by large-scale systems like Amazon Dynamo, Cassandra, and Redis Cluster to maintain balance and minimize re-distribution when nodes change.
In this post, we’ll break down:
What consistent hashing is
Why it matters in system design
How it works conceptually
Real-world use cases and benefits
Trade-offs and design extensions
🧠 The Problem with Traditional Hashing
Let’s say you use the simplest way to distribute keys:
hash(key) % number_of_servers
If you have 4 servers (0, 1, 2, 3), the key user123 might go to:
hash(user123) % 4 → Server 2
But what if you add a 5th server?
Now, hash(user123) % 5 could result in:
→ Server 3 (not 2 anymore!)
Result:
Almost all keys are remapped to different servers — massive data movement, cache misses, and inefficiency.
This is why we need consistent hashing.
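To make the pain concrete, here is a small Python sketch (the key names and the MD5 choice are purely illustrative) that counts how many keys land on a different server when a 4-server cluster grows to 5 under modulo hashing:

```python
import hashlib

def bucket(key: str, n: int) -> int:
    # Use a stable hash; Python's built-in hash() is randomized per process.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % n

keys = [f"user{i}" for i in range(10_000)]
moved = sum(bucket(k, 4) != bucket(k, 5) for k in keys)
print(f"{moved / len(keys):.0%} of keys moved")  # roughly 80%
```

With a uniform hash, a key stays put only when its hash agrees mod 4 and mod 5, which happens for about 1 in 5 keys, so roughly 80% of the data moves for a single added server.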
🔄 What is Consistent Hashing?
Consistent hashing is a key distribution algorithm that maps both data and servers onto the same hash ring.
Instead of assigning keys to servers based on modulo, it places both keys and nodes on a circular “hash space” (usually from 0 to 2³²-1).
Keys are assigned to the closest server in a clockwise direction.
🔁 Visualizing the Hash Ring
Imagine a circle numbered from 0 to MAX_HASH (e.g., 2³² − 1).
Each server is assigned a position on the circle via a hash function.
Each key is also hashed and placed on the circle.
A key belongs to the first server clockwise from its position.
So even if a server is added or removed, only a small subset of keys needs to be re-mapped.
📌 How It Works – Step by Step
Hash Function:
Both servers and data keys are hashed using the same function.
Placement on Circle:
All hash outputs are mapped to points on a circle (e.g., 0 to 2³² − 1).
Assigning Keys:
Each key is stored in the first server found when moving clockwise from the key’s position on the circle.
Server Add/Remove:
When a new server is added, only the keys between the new server and its predecessor are reallocated.
The rest of the keys stay untouched.
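The four steps above can be sketched as a minimal Python ring (the `HashRing` class and the MD5 choice are assumptions for illustration; a production system would pick a faster hash and handle concurrency):

```python
import bisect
import hashlib

def ring_pos(s: str) -> int:
    # Map any string onto the ring [0, 2^32).
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2**32)

class HashRing:
    def __init__(self, nodes=()):
        self.ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node: str):
        bisect.insort(self.ring, (ring_pos(node), node))

    def remove(self, node: str):
        self.ring.remove((ring_pos(node), node))

    def get(self, key: str) -> str:
        # First node clockwise from the key's position, wrapping at the top.
        i = bisect.bisect_left(self.ring, (ring_pos(key),))
        return self.ring[i % len(self.ring)][1]
```

Adding a node only claims the arc between it and its predecessor; lookups for keys outside that arc return the same node as before.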
🧪 Example
Assume servers A, B, and C are placed at positions:
A → hash(A) = 100
B → hash(B) = 300
C → hash(C) = 600
Key X → hash(X) = 350
So, X maps to Server C, the next clockwise node.
Now if you add Server D at hash = 400:
Only the keys between 300 and 400 get reassigned from C to D. That includes key X (at 350), which now maps to D.
Every other key stays exactly where it was — minimal disruption.
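We can replay this exact example in Python, plugging in the article's illustrative ring positions directly instead of computing a real hash:

```python
import bisect

def owner(positions, key_pos):
    # positions: list of (ring_position, server). Find the first server
    # clockwise from key_pos, wrapping around the top of the ring.
    pts = sorted(positions)
    i = bisect.bisect_left(pts, (key_pos,))
    return pts[i % len(pts)][1]

servers = [(100, "A"), (300, "B"), (600, "C")]
print(owner(servers, 350))  # C: the next clockwise node from 350
servers.append((400, "D"))
print(owner(servers, 350))  # D: 350 now falls in the arc D claims from C
```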
🎯 Why Consistent Hashing Is Useful
Scalable: Add or remove nodes with minimal rehashing.
Load Balancing: Evenly distributes keys (with the help of virtual nodes).
Fault Tolerant: Easy to redistribute keys when nodes fail.
Used in Real Systems: DynamoDB, Cassandra, Redis Cluster, Akamai, Memcached.
🧩 Virtual Nodes (VNodes)
One downside of basic consistent hashing is that servers might be unevenly distributed on the hash ring.
To solve this, we introduce Virtual Nodes:
Each physical server is assigned multiple virtual positions on the ring.
When a server joins or leaves, the load is spread more evenly.
Virtual nodes improve balancing and reduce hot spots.
Example:
Server A has 3 virtual nodes at hash positions: 101, 205, 999
Each vnode is treated as a separate node in key assignment.
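A sketch of how vnodes extend the basic ring (the `VNodeRing` name, the `"A#0"`-style suffix scheme, and 100 replicas per server are all illustrative choices, not a standard):

```python
import bisect
import hashlib

def ring_pos(s: str) -> int:
    # Map any string onto the ring [0, 2^32).
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2**32)

class VNodeRing:
    def __init__(self, replicas: int = 100):
        self.replicas = replicas  # virtual nodes per physical server
        self.ring = []            # sorted list of (position, physical server)

    def add(self, server: str):
        for i in range(self.replicas):
            # Each vnode hashes to its own spot on the ring, e.g. "A#0", "A#17".
            bisect.insort(self.ring, (ring_pos(f"{server}#{i}"), server))

    def get(self, key: str) -> str:
        i = bisect.bisect_left(self.ring, (ring_pos(key),))
        return self.ring[i % len(self.ring)][1]
```

With many vnodes per server, each server's share of the ring concentrates near 1/N, and when a server leaves, its load is absorbed by many neighbors instead of a single successor.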
⚙️ Implementation Details
Hash Function: Use a uniform and fast hash like SHA-1 or MurmurHash.
Data Replication: Often, data is stored on multiple successor nodes for redundancy.
Rebalancing: Only small portions of data move during topology changes.
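The replication point can be sketched by walking the ring clockwise and collecting N distinct physical servers; the deduplication matters once vnodes place the same server on the ring many times (`replica_set` is a hypothetical helper, not a real library call):

```python
import bisect

def replica_set(ring, key_pos, n):
    # ring: sorted list of (position, server). Walk clockwise from the
    # key's position, collecting n distinct servers for redundancy.
    owners = []
    start = bisect.bisect_left(ring, (key_pos,))
    for step in range(len(ring)):
        server = ring[(start + step) % len(ring)][1]
        if server not in owners:
            owners.append(server)
        if len(owners) == n:
            break
    return owners

ring = [(100, "A"), (300, "B"), (600, "C")]
print(replica_set(ring, 350, 2))  # ['C', 'A']
```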
📦 Real-World Use Cases
🧵 Distributed Caches (e.g., Memcached, Redis Cluster)
Ensures cache keys are distributed fairly, and cache misses are minimized during scaling.
🌩️ Dynamo-Style Databases
Amazon Dynamo, Cassandra, and Scylla use consistent hashing for partitioning and replication.
🧭 Content Delivery Networks (CDNs)
Route users to the right edge node based on consistent hash of content or IP.
🧠 Interview Tips
In a system design interview, if you’re asked to:
Build a distributed cache
Design a database sharding strategy
Distribute data among multiple storage nodes
…then bringing up consistent hashing (and why you’d use it) shows strong architectural thinking.
👉 Mention concepts like:
“Only K/N keys are re-mapped on server change.”
“We’d use virtual nodes for better balance.”
“Hash ring allows us to scale out gracefully.”
✅ Summary
Consistent hashing is a cornerstone of scalable, fault-tolerant distributed systems. It allows systems to:
Avoid complete key reshuffling
Balance load across servers
Recover quickly from node failures
Grow without significant re-architecture
If you’re working on a backend system that requires data distribution or partitioning — this is a concept you’ll use or be asked to design.
Written by Yashveer Singh