CAP Theorem Simplified in 5 Minutes

The CAP theorem was introduced by computer scientist Eric Brewer in 2000. It is widely used in distributed computing to guide the design and architecture of systems that need to handle data across multiple locations. By understanding the CAP theorem, developers can make informed decisions about which properties to prioritize based on their specific use case and requirements.

In the era of distributed computing, the CAP theorem is a fundamental principle. It helps us understand the trade-offs involved in designing distributed systems. The CAP theorem states that a distributed data store can only provide two out of the following three guarantees: Consistency, Availability, and Partition Tolerance. This means that when designing a system, you must choose which two of these properties are most important, as achieving all three simultaneously is impossible.

To break it further:

C stands for Consistency: All users must see the same data at the same time.
A stands for Availability: Every request is served with a response.
P stands for Partition Tolerance: Systems should work despite any failure between two nodes.

Generally speaking, no company wants their system to fail. Therefore, partition tolerance is essential. The big challenge is choosing between availability and consistency, which means deciding whether to risk providing incorrect data or to STOP serving data.

Fig1: Image credits: Hello interview-SWE interview preparation

Suppose we consider two users in different locations, the USA and Europe. User A sends a write request to change his profile photo to the USA server, and User B, who is in Europe, sends a read request to view User A's profile to the European server. In such scenarios, the USA server usually has replicas in Europe. Now, if there is any failure in both servers, and if availability is chosen, then User B will not see the updated profile picture but can still use the system as it remains online. If consistency is chosen, then User B cannot access the system or will receive an error message when trying to access User A's profile photo. Each has its own trade-offs, and choosing the right one depends on the type of action performed by the system.

Imagine User A wants to book a ticket on a ticket booking platform. User A successfully books a specific seat number. Now, if User B tries to book a ticket for the same seat from different location and two servers fail, User B should not be able to access the seat already booked by User A. In this case, consistency is crucial because all users must see the same data at the same time, rather than just keeping the system available.

The CAP theorem applies not only to entire systems but also to different parts of the same system, which makes it powerful for complex systems

Example: Tinder

Availability - for profile picture reads
Consistency - for swipes from different locations

In conclusion, the CAP theorem highlights the trade-offs between consistency, availability, and partition tolerance in distributed systems. It is essential to choose the right balance based on the specific needs of your application. For instance, ensuring consistency is crucial for actions like booking tickets, where all users need to see the same data simultaneously. On the other hand, availability might be prioritized for less critical operations, like reading profile pictures. Understanding and applying CAP principles can help design more effective and resilient systems.

Thank you!

Stay inspired.

Understanding CAP Theorem in Just 5 Minutes

Subscribe to my newsletter

Varun_inspire

Varun_inspire