Shuffle Sharding - the secret sauce behind AWS reliability

Snehasish Roy

In a typical client-server architecture, requests from the client are forwarded to a random application server by the Application Load Balancer. When a request is a poison pill, i.e., it crashes or hangs the server that receives it, then all of your application servers will eventually crash or hang one by one if the client keeps retrying, since the retried requests will eventually be forwarded to every server at random.

Architecture Diagram with Application Load Balancer and eight worker nodes marked unavailable



Will Simple Sharding help?

One way to isolate this problem is to create virtual shards, i.e., partition requests from clients so that each client is served only by a specific subset of instances. This is generally done with hashing, e.g., modulo or consistent hashing.
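As a minimal sketch of the modulo approach, the hash of a customer identifier can be reduced modulo the shard count. The `shardFor` helper and the shard count of 4 below are illustrative assumptions, not an AWS API:

```java
import java.util.Objects;

public class SimpleSharding {
    // Maps a customer id to one of `totalShards` shards using modulo hashing.
    static int shardFor(String customerId, int totalShards) {
        // Math.floorMod guards against negative hashCode values
        return Math.floorMod(Objects.hashCode(customerId), totalShards);
    }

    public static void main(String[] args) {
        // The same customer always lands on the same shard
        System.out.println("Alpha -> shard " + shardFor("Alpha", 4));
        System.out.println("Bravo -> shard " + shardFor("Bravo", 4));
    }
}
```

Because the mapping is deterministic, a poison-pill customer keeps hitting the same shard, which is exactly the isolation property discussed below.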

In the below diagram, all requests originating from client names starting with A or B go to Worker1 and Worker2. So if Alpha's request is a poison pill, only two servers are impacted. The remaining servers are unaffected, as the requests never reach them.

Do note that the Bravo request won't be fulfilled either, because both Worker1 and Worker2 have crashed or hung due to the poison pill request from Alpha. One dirty fish has poisoned the entire lake.

This strategy has definitely reduced the blast radius, but requests from Bravo are still not being served. Can we do better?

Architecture Diagram with Application Load Balancer, shards and flow of customer to worker



Can increasing the shards help?

The issue with the above approach was how we created the shards: there were simply too few combinations available, since each instance was mapped to only one shard.

If we allow each instance to be mapped to multiple shards, then we can increase the number of combinations available and reduce our unavailability.

In the below diagram, we create 8 shards to map 8 clients. Previously we had only 4 shards.

Architecture Diagram with Application Load Balancer, eight shards with two worker nodes per shard, and each worker being assigned to two different shards


Note that each worker is mapped to multiple shards, e.g., Worker2 is mapped to Shard1 and Shard2.

Now if requests from Alpha crash or hang Worker1 and Worker2, requests from Bravo would still continue to work, as Bravo is mapped to Worker2 and Worker3. Since Worker3 is still alive, requests from Bravo would still get served.

Customer Name | Workers
------------- | ---------------------
Alpha         | Worker-1 and Worker-2
Bravo         | Worker-2 and Worker-3
Charlie       | Worker-3 and Worker-4
Delta         | Worker-4 and Worker-5
Echo          | Worker-5 and Worker-6
Foxtrot       | Worker-6 and Worker-7
Golf          | Worker-7 and Worker-8
Hotel         | Worker-8 and Worker-1
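For illustration, the ring-style mapping in the table above (each customer gets two consecutive workers, wrapping around so Hotel shares Worker-1 with Alpha) could be generated like this; the class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

public class RingAssignment {
    static final String[] CLIENTS = {"Alpha", "Bravo", "Charlie", "Delta",
                                     "Echo", "Foxtrot", "Golf", "Hotel"};

    // Client at position i is served by worker i+1 and worker i+2 (1-based),
    // with the index wrapping around the ring of workers.
    static List<String> workersFor(int clientIndex, int totalWorkers) {
        List<String> workers = new ArrayList<>();
        workers.add("Worker-" + (clientIndex % totalWorkers + 1));
        workers.add("Worker-" + ((clientIndex + 1) % totalWorkers + 1));
        return workers;
    }

    public static void main(String[] args) {
        for (int i = 0; i < CLIENTS.length; i++) {
            System.out.println(CLIENTS[i] + " -> " + workersFor(i, 8));
        }
    }
}
```

With 8 workers and 8 clients, any single poison pill takes down two workers but leaves at least one live worker for every neighbouring customer.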

Let’s Shuffle!

Hope you were able to grasp the fundamentals from the above example. The idea is to create more shards with as little overlap as possible. Let's see how we can build a generic solution.

Given n application servers, if we randomly choose k instances and make them part of one virtual shard, then each shard has k instances. The probability of two shards overlapping 100% drops drastically as we increase the value of k.
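One simple way to sketch "randomly choose k of n" in Java is to shuffle the list of instance ids with a seed derived from the customer id and take the first k. The seeding scheme here is an assumption for illustration (`String.hashCode` is convenient but not a production-grade seed):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class ShuffleShard {
    // Picks k distinct instances out of n to form one virtual shard.
    // Seeding the shuffle from the customer id keeps assignment deterministic.
    static List<Integer> buildShard(String customerId, int n, int k) {
        List<Integer> instances = new ArrayList<>();
        for (int i = 0; i < n; i++) instances.add(i);
        Collections.shuffle(instances, new Random(customerId.hashCode()));
        return new ArrayList<>(instances.subList(0, k));
    }

    public static void main(String[] args) {
        System.out.println("Alpha -> " + buildShard("Alpha", 10, 4));
        System.out.println("Bravo -> " + buildShard("Bravo", 10, 4));
    }
}
```

Two customers get independent pseudo-random subsets, so a full overlap between them is possible but rare, which is the property quantified below.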

To simplify, if we have 10 cards and want to choose 4 of them, there are 10 choose 4 = 210 possible combinations. If we randomly generate 2 such combinations, the probability of them having 100% overlap, i.e., all 4 cards being the same, is (1 / 210) ~= 0.47%.

Now to relate this analogy to our problem statement, given 10 application instances, if we randomly choose 4 instances to create a virtual shard, the probability of 2 shards with 100% overlap would be ~0.47%. This indicates that a poison request can only impact 0.47% of shards. We can further reduce this by increasing the total number of instances or increasing the shard size.
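The arithmetic above can be checked with a small helper; the `choose` implementation below is a standard multiplicative formula, not tied to any particular library:

```java
public class OverlapMath {
    // n choose k via the multiplicative formula, evaluated left to right
    // so every intermediate division is exact.
    static long choose(int n, int k) {
        long result = 1;
        for (int i = 1; i <= k; i++) {
            result = result * (n - k + i) / i;
        }
        return result;
    }

    public static void main(String[] args) {
        long combos = choose(10, 4);            // 210 possible shards
        double pFullOverlap = 100.0 / combos;   // full-overlap probability, in percent
        System.out.printf("C(10,4) = %d, full-overlap probability ~= %.3f%%%n",
                combos, pFullOverlap);
    }
}
```

Increasing n or k grows the combination count rapidly; e.g., C(100, 5) is over 75 million, making a full overlap between two random shards vanishingly unlikely.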


Talk is cheap, show me the code!

Shuffle sharding can be implemented in two ways — stateless or stateful.

Stateless sharding, as the name indicates, does not persist any state data (i.e., shard information) in a datastore. It simply derives the target application instances from an identifier; the target instances are the members of the virtual shard mapped to that request.

Stateful sharding goes a bit further and persists the information of the shards to a database, which allows further customization of the way shards are created, e.g., customizing the shard assignment strategy by tuning weights.

Let’s see how we can implement stateless shuffle sharding using a simple strategy of generating multiple hashes from a unique identifier, followed by mapping that hash to a unique node.

public Set<Integer> assignNodes(String customerId) {
    Set<Integer> assignedNodes = new HashSet<>();

    // Need to find 4 nodes for the virtual shard
    for (int i = 0; i < 4; i++) {
        // Create a unique input for each node selection
        String hashInput = customerId + ":" + i;
        // Map the hash to a node id in [0, totalNodes)
        int nodeId = hashToNodeId(hashInput);

        // Handle collisions by trying the next node in the ring
        while (assignedNodes.contains(nodeId)) {
            nodeId = (nodeId + 1) % config.getTotalNodes();
        }
        assignedNodes.add(nodeId);
    }
    return assignedNodes;
}

// config is assumed to expose the total number of application instances
private int hashToNodeId(String input) {
    // floorMod guards against negative hashCode values
    return Math.floorMod(input.hashCode(), config.getTotalNodes());
}

We can also leverage multiple hash functions, similar to a Bloom Filter, to generate multiple hashes instead of generating multiple inputs from the identifier, in case multiple unique inputs can't be generated.
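A sketch of that Bloom-filter-style variant: instead of appending a counter to the identifier, salt a SHA-256 digest with a different round number, effectively simulating k independent hash functions. The salting scheme and the `totalNodes`/`k` parameters are assumptions for illustration (and `totalNodes` must be at least `k` for the loop to terminate):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashSet;
import java.util.Set;

public class MultiHashSharding {
    // Derives k distinct node ids from one identifier by salting the digest
    // with a round counter, similar to the k hash functions of a Bloom filter.
    static Set<Integer> assignNodes(String customerId, int totalNodes, int k) {
        Set<Integer> nodes = new LinkedHashSet<>();
        int salt = 0;
        while (nodes.size() < k) {
            byte[] digest = sha256(salt++ + ":" + customerId);
            // fold the first four bytes of the digest into an int
            int h = ((digest[0] & 0xFF) << 24) | ((digest[1] & 0xFF) << 16)
                  | ((digest[2] & 0xFF) << 8) | (digest[3] & 0xFF);
            nodes.add(Math.floorMod(h, totalNodes));
        }
        return nodes;
    }

    static byte[] sha256(String input) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError("SHA-256 is a mandatory JDK algorithm", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Alpha -> " + assignNodes("Alpha", 10, 4));
    }
}
```

Because duplicates are simply skipped by the `Set`, this avoids the linear-probing collision loop of the previous snippet, at the cost of computing extra digests when rounds collide.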


Shuffle Sharding is a very powerful and versatile technique, used not just by AWS but also in popular open-source projects like Grafana Loki and Grafana Mimir. Thank you for reading. Hope you learnt something new. If you have any questions, please do comment.
