🔁 System Design Series: Design Consistent Hashing

Yashveer Singh
4 min read

🚀 Introduction

In distributed systems, we often need to distribute data across multiple machines or servers. But what happens when a new server is added or removed?

A naive hash-and-modulo assignment would require almost complete data reallocation, which is extremely inefficient at scale.

This is where consistent hashing comes in — a beautifully elegant technique used by large-scale systems like Amazon Dynamo, Cassandra, and Redis Cluster to maintain balance and minimize re-distribution when nodes change.

In this post, we’ll break down:

  • What consistent hashing is

  • Why it matters in system design

  • How it works conceptually

  • Real-world use cases and benefits

  • Trade-offs and design extensions

🧠 The Problem with Traditional Hashing

Let’s say you use the simplest way to distribute keys:

key % number_of_servers

If you have 4 servers (0, 1, 2, 3), the key user123 might go to:

hash(user123) % 4 → Server 2

But what if you add a 5th server?

Now, hash(user123) % 5 could result in:

→ Server 3 (not 2 anymore!)

Result:

Almost all keys are remapped to different servers — massive data movement, cache misses, and inefficiency.
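
To put a rough number on it, here is a quick simulation (the key names, key count, and use of MD5 are illustrative assumptions, not part of the original example):

```python
import hashlib

def server_for(key: str, num_servers: int) -> int:
    # Naive modulo assignment: hash the key, then take it mod the server count.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_servers

keys = [f"user{i}" for i in range(10_000)]  # made-up keys for illustration
moved = sum(1 for k in keys if server_for(k, 4) != server_for(k, 5))
print(f"{moved / len(keys):.0%} of keys changed servers")  # typically around 80%
```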

This is why we need consistent hashing.

🔄 What is Consistent Hashing?

Consistent hashing is a key distribution algorithm that maps both data and servers onto the same hash ring.

Instead of assigning keys to servers based on modulo, it places both keys and nodes on a circular “hash space” (usually from 0 to 2³²-1).

Keys are assigned to the closest server in a clockwise direction.


🔁 Visualizing the Hash Ring

Imagine a circle numbered from 0 to MAX_HASH (e.g., 2³²-1).

  • Each server is assigned a position on the circle via a hash function.

  • Each key is also hashed and placed on the circle.

  • A key belongs to the first server clockwise from its position.

So even if a server is added or removed, only a small subset of keys needs to be re-mapped.


📌 How It Works – Step by Step

  1. Hash Function:

    Both servers and data keys are hashed using the same function.

  2. Placement on Circle:

    All hash outputs are mapped to points on a circle (e.g., 0 to 2³²-1).

  3. Assigning Keys:

    Each key is stored in the first server found when moving clockwise from the key’s position on the circle.

  4. Server Add/Remove:

    When a new server is added, only the keys between the new server and its predecessor are reallocated.

    The rest of the keys stay untouched.
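
Putting those four steps together, here is a minimal Python sketch of a hash ring. The HashRing class, its method names, and the MD5-based hash are illustrative choices, not a specific library's API:

```python
import bisect
import hashlib

RING_SIZE = 2**32  # positions 0 to 2**32 - 1, as described above

def ring_hash(value: str) -> int:
    # Map any string (server name or key) to a position on the ring.
    return int(hashlib.md5(value.encode()).hexdigest(), 16) % RING_SIZE

class HashRing:
    """Minimal consistent-hash ring (no virtual nodes yet)."""

    def __init__(self, servers=()):
        self._ring = []  # sorted list of (position, server) pairs
        for server in servers:
            self.add_server(server)

    def add_server(self, server: str) -> None:
        bisect.insort(self._ring, (ring_hash(server), server))

    def remove_server(self, server: str) -> None:
        self._ring.remove((ring_hash(server), server))

    def get_server(self, key: str) -> str:
        # Walk clockwise: the first server at or after the key's position owns it.
        index = bisect.bisect_left(self._ring, (ring_hash(key), ""))
        if index == len(self._ring):  # wrap around the end of the ring
            index = 0
        return self._ring[index][1]
```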


🧪 Example

Assume servers A, B, and C are placed at positions:

  • A → hash(A) = 100

  • B → hash(B) = 300

  • C → hash(C) = 600

Key X → hash(X) = 450

So, X maps to Server C, the next clockwise node.

Now if you add Server D at hash = 400:

Only keys between 300 and 400 get reassigned to D.

Key X still goes to C — minimal disruption.
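
Continuing the hypothetical HashRing sketch from above, a quick check shows that adding a fourth server only moves the keys that land on its arc (the server names and key count are made up, and exact percentages depend on the hash function):

```python
ring = HashRing(["server-A", "server-B", "server-C"])
keys = [f"user{i}" for i in range(10_000)]
before = {k: ring.get_server(k) for k in keys}

ring.add_server("server-D")  # topology change: one new node
after = {k: ring.get_server(k) for k in keys}

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of keys moved")  # only the keys on D's arc move
```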


🎯 Why Consistent Hashing Is Useful

  1. Scalable: Add or remove nodes with minimal rehashing.

  2. Load Balancing: Evenly distributes keys (with the help of virtual nodes).

  3. Fault Tolerant: Easy to redistribute keys when nodes fail.

  4. Used in Real Systems: DynamoDB, Cassandra, Redis Cluster, Akamai, Memcached.


🧩 Virtual Nodes (VNodes)

One downside of basic consistent hashing is that servers might be unevenly distributed on the hash ring.

To solve this, we introduce Virtual Nodes:

  • Each physical server is assigned multiple virtual positions on the ring.

  • When a server joins or leaves, the load is spread more evenly.

  • Virtual nodes improve balancing and reduce hot spots.

Example:

Server A has 3 virtual nodes at hash positions: 101, 205, 999

Each vnode is treated as a separate node in key assignment.
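
Here is how virtual nodes could be layered onto the hypothetical HashRing sketch above (it reuses bisect, ring_hash, and HashRing from that sketch; the replicas count of 100 is an arbitrary illustration):

```python
class VNodeHashRing(HashRing):
    """Each physical server is hashed to `replicas` positions on the ring."""

    def __init__(self, servers=(), replicas: int = 100):
        self.replicas = replicas
        super().__init__(servers)

    def add_server(self, server: str) -> None:
        for i in range(self.replicas):
            # "server-A#0", "server-A#1", ... each get their own ring position,
            # but all map back to the same physical server.
            bisect.insort(self._ring, (ring_hash(f"{server}#{i}"), server))

    def remove_server(self, server: str) -> None:
        self._ring = [(pos, s) for pos, s in self._ring if s != server]
```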


⚙️ Implementation Details

  • Hash Function: Use a uniform and fast hash like SHA-1 or MurmurHash.

  • Data Replication: Often, data is stored on multiple successor nodes for redundancy (see the sketch after this list).

  • Rebalancing: Only small portions of data move during topology changes.
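
As a rough illustration of successor-based replication, the key's owner plus the next distinct servers clockwise hold the copies. The get_replicas helper below is hypothetical and reads the sketch's internal _ring list directly for brevity:

```python
def get_replicas(ring: HashRing, key: str, count: int = 3) -> list[str]:
    # Walk clockwise from the key's position, collecting distinct physical servers.
    start = bisect.bisect_left(ring._ring, (ring_hash(key), ""))
    replicas: list[str] = []
    for offset in range(len(ring._ring)):
        _, server = ring._ring[(start + offset) % len(ring._ring)]
        if server not in replicas:
            replicas.append(server)
        if len(replicas) == count:
            break
    return replicas

# e.g. get_replicas(ring, "user42") -> the owner plus its two clockwise successors
```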


📦 Real-World Use Cases

🧵 Distributed Caches (e.g., Memcached, Redis Cluster)

Ensures cache keys are distributed fairly, and cache misses are minimized during scaling.

🌩️ Dynamo-Style Databases

Amazon Dynamo, Cassandra, and Scylla use consistent hashing for partitioning and replication.

🧭 Content Delivery Networks (CDNs)

Route users to the right edge node based on a consistent hash of the content or the user's IP.


🧠 Interview Tips

In a system design interview, if you’re asked to:

  • Build a distributed cache

  • Design a database sharding strategy

  • Distribute data among multiple storage nodes

…then bringing up consistent hashing (and why you’d use it) shows strong architectural thinking.

👉 Mention concepts like:

  • “Only about K/N keys are re-mapped when a server is added or removed (K = total keys, N = servers).”

  • “We’d use virtual nodes for better balance.”

  • “Hash ring allows us to scale out gracefully.”


✅ Summary

Consistent hashing is a cornerstone of scalable, fault-tolerant distributed systems. It allows systems to:

  • Avoid complete key reshuffling

  • Balance load across servers

  • Recover quickly from node failures

  • Grow without significant re-architecture

If you’re working on a backend system that requires data distribution or partitioning — this is a concept you’ll use or be asked to design.
