From Modulo to Consistent Hashing: Optimizing Distributed Storage

🔥 Ever tried scaling a single database past its limits? You’ll quickly encounter massive rebalance storms and downtime. While a single-server setup might handle initial workloads easily, expanding to tens or hundreds of millions of users demands distributed storage, bringing unique challenges to data management.

⚙️ Design Goals for Distributed Systems

  • Uniform distribution: Avoid hotspots by evenly spreading data across nodes.

  • High throughput: Scale horizontally for fast reads and writes.

  • Elasticity: Add or remove nodes without disrupting service.

  • Resilience: Handle node failures, network partitions, and unpredictable workloads.

To achieve this, we need a mechanism that quickly determines data placement and minimizes shard rebalancing during cluster changes. Let's explore the need for consistent hashing with an illustrative example.

🛒 Use Case: Shopping Cart Service in a Global E-Commerce Site

Imagine you are building the shopping cart service for a global e-commerce site. Initially, with just a few thousand users, everything is simple and fits into a single node, as illustrated in Figure 1.

Your business is a success, and the user base is growing rapidly along with the data. This rapid growth soon necessitates distributing the data across multiple nodes, a process known as sharding, as shown in Figure 2.

You store the information in a fast key-value store to ensure that actions like “Add to cart” and “Checkout” are quick and responsive. Here’s how a record may look in a key-value store:

Key   = "<user_id>"
Value = 
{
  "userId": "827391",
  "items": [
    {"productId": "SKU12345", "quantity": 2, "unitPrice": 25.99}
  ],
  "lastUpdated": "2025-05-27T14:32:15Z",
  "currency": "USD"
}

To scale, it’s necessary to shard the data, keyed by user_id in our case. But how does the system decide the mapping between a shard ID and a server node? In other words, how do I know where to put my data?

🔢 Modulo Hashing

Hashing deterministically converts variable‑length inputs into fixed‑size numeric values. A strong hash is fast, uniformly distributes outputs to avoid hotspots, and has a low collision rate—though it’s one‑way, so you can’t reverse it to recover the original data. Good hashing sets the stage for efficient sharding.
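To make these properties concrete, here is a minimal Python sketch using the standard library’s hashlib (purely an illustration; real systems typically prefer faster non-cryptographic hashes). It shows that hashing is deterministic and always produces a fixed-size value, regardless of input length:

import hashlib

def hash_to_int(key: str) -> int:
    # Deterministically map a variable-length string to a fixed-size 64-bit integer.
    digest = hashlib.sha256(key.encode("utf-8")).digest()  # always 32 bytes
    return int.from_bytes(digest[:8], "big")                # keep 64 bits of it

print(hash_to_int("827391"))   # same input -> same output, every run
print(hash_to_int("827391"))
print(hash_to_int("a-much-longer-user-identifier-string"))  # still a 64-bit value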

Let’s start our use case with a 3-node cluster (N0, N1, N2). Modulo hashing efficiently determines data placement:

nodes = ["N0", "N1", "N2"]
N = len(nodes)  # 3
db_node_id = nodes[hash(user_id) % N]

# Example assignments:
# user_id=42  -> hash(42)  % 3 = 0 → nodes[0] → "N0"
# user_id=100 -> hash(100) % 3 = 1 → nodes[1] → "N1"
# user_id=107 -> hash(107) % 3 = 2 → nodes[2] → "N2"

It’s a super-simple lookup. The distribution of shopping cart data among the three nodes in the database cluster is depicted in Figure 3.

Key Definitions

  • Cluster: A group of server nodes working together to store and serve your data.

  • Shard Manager: The component (often built into the database or cache) that maps each shard ID to a specific node in the cluster.
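As a loose illustration (not any particular database’s API), the shard manager can be thought of as a small lookup table from shard ID to node; here is a hypothetical sketch:

# A hypothetical shard manager's view of the cluster: shard ID → node.
shard_map = {0: "N0", 1: "N1", 2: "N2"}

def node_for_shard(shard_id: int) -> str:
    return shard_map[shard_id]

print(node_for_shard(2))   # "N2"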

While modulo hashing is simple, it suffers significantly when scaling up or down.

➕ Adding a New Node to the cluster

During the peak sale season, a new node is added to take on some of the load. With a 4th node, the modulo base changes, and the recalculation shifts almost every user’s data, causing extensive rebalancing:

nodes = ["N0", "N1", "N2", "N3"]
N = len(nodes)  # 4
db_node_id = nodes[hash(user_id) % N]

# Example assignments:
# user_id=42  -> hash(42)  % 4 = 2 → nodes[2] → "N2" (shifted)
# user_id=100 -> hash(100) % 4 = 0 → nodes[0] → "N0" (shifted)
# user_id=107 -> hash(107) % 4 = 3 → nodes[3] → "N3" (shifted)

➖ Removing a Node from the cluster

Say N0 is removed from the cluster. The new shard mapping is then calculated as follows:

nodes = ["N1", "N2", "N3"]
N = len(nodes)  # 3
db_node_id = nodes[hash(user_id) % N]

# Example assignments:
# user_id=42  -> hash(42)  % 3 = 0 → nodes[0] → "N1" (shifted)
# user_id=100 -> hash(100) % 3 = 1 → nodes[1] → "N2" (shifted)
# user_id=107 -> hash(107) % 3 = 2 → nodes[2] → "N3" (unchanged)

Similarly, removing a node causes widespread data movement.

Since the node a user ID maps to depends on the total number of active nodes in the cluster, a large share of the data must move whenever a node is added or removed. In large systems, this frequent reshuffling is inefficient, leading to downtime and performance hits.
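To see how much data actually moves, here is a small, self-contained Python simulation (an illustration, not a benchmark) that counts how many of 100,000 user IDs change nodes when the cluster grows from 3 to 4 nodes under modulo hashing:

import hashlib

def node_for(user_id: int, n_nodes: int) -> int:
    # Use a real hash (rather than Python's identity hash for ints) to mimic uniform placement.
    h = int.from_bytes(hashlib.sha256(str(user_id).encode()).digest()[:8], "big")
    return h % n_nodes

moved = sum(1 for uid in range(100_000) if node_for(uid, 3) != node_for(uid, 4))
print(f"{moved / 100_000:.0%} of keys change nodes when going from 3 to 4 nodes")
# Roughly 75% of the keys move; in general, going from N to N+1 nodes
# reassigns about N/(N+1) of all keys.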

🌐 Consistent Hashing: The Scalable Cure

Consistent hashing is a highly efficient hashing mechanism that is used in many large-scale distributed systems, such as Cassandra and DynamoDB.

Consistent hashing represents the whole key space as a logical ring. The ring’s size is fixed and does not depend on the number of nodes (in practice it is very large, e.g., 0 to 2^32 - 1); for our example, let’s keep it between 0 and 360, like degrees on a circle, for ease of understanding. Each physical node is hashed using a hash function and placed at the corresponding position on the hash ring, as shown in Figure 6.

Just as vehicles choose the nearest exit on a roundabout, consistent hashing picks the ‘closest’ node on the ring to store each key.

So, in a key-value store that uses consistent hashing, each key is hashed using the same hash function, and the key is assigned to the nearest node in the clockwise direction from the key’s position on the ring. Revisiting our shopping cart use case with the same user IDs, the calculation changes slightly, as shown below.

ring_position = hash(user_id) % ring_size   # ring_size = 360
db_node_id    = first node clockwise from ring_position

# user_id=42  → hash(42)  % 360 = 42  → N0 [next node clockwise from 42 is N0 at 90]
# user_id=100 → hash(100) % 360 = 100 → N1 [next node clockwise from 100 is N1 at 220]
# user_id=107 → hash(107) % 360 = 107 → N1 [next node clockwise from 107 is N1 at 220]

Figure 7 helps to clarify this further.
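Here is a minimal Python sketch of that clockwise lookup. It assumes the node positions from the figure (N0 at 90, N1 at 220) plus an assumed position of 330 for N2, and uses bisect to find the first node position at or after the key’s position, wrapping around to the first node if the key falls past the last one:

import bisect

RING_SIZE = 360
# Sorted (position, node) pairs on the ring; 330 for N2 is an assumed value for illustration.
ring = [(90, "N0"), (220, "N1"), (330, "N2")]
positions = [p for p, _ in ring]

def node_for_key(user_id: int) -> str:
    pos = hash(user_id) % RING_SIZE            # the key's position on the ring
    i = bisect.bisect_left(positions, pos)     # first node at or after that position
    return ring[i % len(ring)][1]              # wrap around past the last node

for uid in (42, 100, 107):
    print(uid, "→", node_for_key(uid))         # 42 → N0, 100 → N1, 107 → N1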

➕ Adding a New Node to the cluster

Let’s understand this with a diagram.

A few things happen when a new node is added, as can be seen in Figure 8.

  • The keys 100 and 107, which were previously part of node N1, are now part of the new node N3, requiring a remapping of keys from Node N1 to N3.

  • The existing data also needs to be moved from N1 to N3.

➖ Removing a Node from the cluster

Let’s understand this with a diagram.

A few things happen when node N0 is removed, as can be seen in Figure 9.

  • The key 42, which was part of node N0, is now part of node N3, requiring a remapping of keys from Node N0 to N3.

  • The existing data also needs to be moved from N0 to N3.

In both scenarios, there is minimal data movement required to adjust the cluster.
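Building on the earlier lookup sketch, the snippet below (again using the assumed positions, with the new node N3 placed at an assumed 165) shows that only the keys sitting between N0 and the new node change owners when N3 joins; everything else stays put:

import bisect

RING_SIZE = 360

def node_for_key(ring, user_id):
    positions = [p for p, _ in ring]
    pos = hash(user_id) % RING_SIZE
    i = bisect.bisect_left(positions, pos)
    return ring[i % len(ring)][1]

before = [(90, "N0"), (220, "N1"), (330, "N2")]
after  = [(90, "N0"), (165, "N3"), (220, "N1"), (330, "N2")]  # N3 joins at 165 (assumed)

for uid in (42, 100, 107):
    old, new = node_for_key(before, uid), node_for_key(after, uid)
    print(f"user_id={uid}: {old} → {new} ({'moved' if old != new else 'unchanged'})")
# Only user IDs 100 and 107 (ring positions between 90 and 165) move, from N1 to N3.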

Ours is a simple example, but it illustrates the concept upon which consistent hashing is built. In our shopping cart use case for an internet-scale global e-commerce site, there can be hundreds of millions of users and thousands of nodes, and the number of shard IDs can be much higher than the number of physical nodes on the ring. This can lead to data skew, resulting in hotspots.

For example, in Figure 9, if many keys hash to positions between 91 and 165, they will all land on N3, potentially making it a hotspot. Worse, if N3 goes down, all of its load shifts to N1, which may overload and fail N1; N1’s load then shifts to N2, overloading N2 and potentially causing it to fail as well. This is called a cascading failure. To avoid cascading failures and to distribute data uniformly across physical nodes, consistent hashing introduces the concept of a virtual node.

⚖️ Virtual Nodes: Enhancing Balance and Stability

A virtual node, as the name suggests, is a logical node added to the consistent hashing ring to make data distribution more uniform and to avoid cascading failures. Virtual nodes are essentially extra positions on the ring. Let’s see how.

  • There are three physical nodes in the cluster: N0, N1, and N2.

  • We create two virtual nodes for each physical node (N0-0, N0-1, etc.).

  • As a result, there will be more node positions on the hash ring.

  • This allows for uniformity in data and load distribution, as can be seen in Figure 10, thus reducing the chance of hotspots.

  • The system maintains a mapping between virtual nodes and physical nodes in the form of a Map<VirtualNode, PhysicalNode> (see the sketch below).

Key-on-node-edge: If a key’s hash exactly matches a vnode’s position (e.g., key at 90), it maps to that vnode. In our example, a key at 90 lands on the vnode at 90 rather than the next one at 104.
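Here is a minimal Python sketch of the idea, assuming 2 virtual nodes per physical node as in the example above. Each physical node contributes several positions to the ring, and a dictionary plays the role of the Map<VirtualNode, PhysicalNode>:

import bisect
import hashlib

RING_SIZE = 360
VNODES_PER_NODE = 2
physical_nodes = ["N0", "N1", "N2"]

def ring_position(name: str) -> int:
    # Stable hash of a name onto the ring.
    return int.from_bytes(hashlib.sha256(name.encode()).digest()[:8], "big") % RING_SIZE

vnode_to_physical = {}   # plays the role of Map<VirtualNode, PhysicalNode>
ring = []                # (position, vnode_name) pairs
for node in physical_nodes:
    for i in range(VNODES_PER_NODE):
        vnode = f"{node}-{i}"                    # e.g., N0-0, N0-1
        vnode_to_physical[vnode] = node
        ring.append((ring_position(vnode), vnode))
ring.sort()
positions = [p for p, _ in ring]

def node_for_key(user_id: str) -> str:
    pos = ring_position(user_id)
    i = bisect.bisect_left(positions, pos)       # first vnode clockwise from the key
    vnode = ring[i % len(ring)][1]
    return vnode_to_physical[vnode]              # resolve the vnode to its physical node

print(node_for_key("827391"))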

🛠️ Consistent Hashing Prototype

Explore a working Java prototype demonstrating key operations:

// --- SimpleConsistentHashRing.java ---
import java.util.*;

public class SimpleConsistentHashRing {
    // Sorted map of ring position -> node id: this is the "ring".
    private final SortedMap<Integer, String> ring = new TreeMap<>();
    private final int N = 360;  // ring size

    public void addNode(String nodeId) {
        // floorMod keeps the position in [0, N) even when the hash is negative.
        int hash = Math.floorMod(Objects.hash(nodeId), N);
        ring.put(hash, nodeId);
    }

    public void removeNode(String nodeId) {
        ring.values().removeIf(id -> id.equals(nodeId));
    }

    public String getNodeForKey(String key) {
        if (ring.isEmpty()) return null;
        int hash = Math.floorMod(Objects.hash(key), N);

        // First node clockwise from the key's position;
        // wrap around to the first node if we fall off the end of the ring.
        SortedMap<Integer, String> tail = ring.tailMap(hash);
        return tail.isEmpty()
             ? ring.get(ring.firstKey())
             : ring.get(tail.firstKey());
    }

    public static void main(String[] args) {
        SimpleConsistentHashRing ring = new SimpleConsistentHashRing();
        ring.addNode("A");
        ring.addNode("B");
        ring.addNode("C");

        System.out.println("user:1001 → " + ring.getNodeForKey("user:1001"));
        ring.addNode("D");
        System.out.println("user:1001 → " + ring.getNodeForKey("user:1001"));
    }
}

✅ Production Best Practices

  1. Use a large, fixed ring size (e.g., the full 64-bit space). Changing the ring size calls for a complete cluster rebalancing and is not efficient (see the sketch after this list).

  2. Select fast, non-cryptographic hash functions (e.g., MurmurHash).

  3. Allocate sufficient virtual nodes per physical node to avoid data skew and hotspots.
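To illustrate the first two points, the sketch below maps keys onto a full 64-bit ring. It uses the standard library’s blake2b purely as a stand-in; in production you would typically reach for a fast non-cryptographic hash such as MurmurHash or xxHash via a third-party package:

import hashlib

RING_SIZE = 2 ** 64   # large, fixed ring size: it never changes as nodes come and go

def ring_position(key: str) -> int:
    # blake2b with an 8-byte digest yields a 64-bit value; the modulo is a no-op
    # here but keeps the intent of a fixed ring size explicit.
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") % RING_SIZE

print(ring_position("user:1001"))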

💬 Final Thoughts & Discussion

Consistent hashing excels for stateful distributed storage, offering elasticity, resilience, and minimal rebalancing overhead. Curious about replication or why systems like Kafka can still scale using modulo hashing? Drop your thoughts in the comments below!
