CAP Theorem

Yeshwanth Chintaginjala
2 min readNov 1, 2020

Understanding CAP Theorem

CAP Theorem:

Any distributed system can only guarantee two among the three properties such as Consistency, Availability, and Partition Tolerance.

Consistency: All the nodes see the same data at the same time.

  1. Every read must receive the last write or an error.
  2. When there is a write transaction all the nodes to be updated with the latest data, which you can do by locking all the nodes before allowing the further reads which definitely has a certain delay

E.g: for suppose you booked your movie ticket from BookMyShow which is written into one node in a distributed system, that seat is blocked for you, someone trying to access the same seat through Paytm from another node in a distributed system, it should show the seat is blocked. there should not be any inconsistency in the data or else you both will end up fighting for the same seat in the movie theater which you already paid for. Alas!

RDBMS systems such as Oracle, Postgres, MYSQL, etc are consistent

Availability: Every request gets a non-error response even in case of failure/success.

This can be achieved by replicating data across multiple nodes.

E.g: The no of views/likes on youtube videos need not always be consistent. But the data needs to be available even in case of node failure there needs to be a response from the available node instead of waiting for the failed node to come up.

Partition Tolerance: Despite Partial failure /loss of message in case of any amount of network failure it should not lead to the failure of entire network failure.

  1. In case of network failure, we can trade-off between the system being consistent/Always available. If you think the system needs to be consistent. Then wait until the failed network/node to come up and update the data across all the nodes and then the further reads be allowed. If you can compromise on consistency we can allow the system to read from any available nodes which guarantee availability over consistency.
  2. One solution to make the system more consistent in this scenario is to stop serving requests from out of date partitions and serve only from the latest partitions
  3. It is not possible to design a distributed system that guarantees both consistency and availability in case of network failure.

E.g: in the case of BookMyShow if a network failure occurs we can’t allocate the same seat to more than one person we choose consistency over availability here, in the case of view count/like count of youtube if a network failure occurs we can choose availability over consistency by reading data from any of the available node. here we go for eventual consistency over consistency which focuses on availability more.

--

--