Zero to a Million Users


This system is not scalable, because after some time it will need more storage than a single machine can provide, so we have to decouple the server from the storage.
There are two types of scaling:
Vertical scaling:
Adding more CPU or RAM to the current server, or increasing the capacity of the database. There is always a hard limit to how far a single machine can be upgraded.
Horizontal scaling:
Adding more servers or databases to the pool of resources. If one server or database goes down, the others can take over and process the requests.
We mostly use horizontal scaling, but the issue is: which server will take up the request from a specific user?
This is where the load balancer comes in!
It receives the request from the user and forwards it to the server with the least load. That server processes the request and sends the response back to the load balancer, which forwards it to the user.
One advantage of using a load balancer is that the user only knows the IP of the load balancer, not the internal servers. Hiding the servers this way reduces the risk of direct attacks and security breaches.
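As a rough sketch of that routing decision, a least-load strategy might look like the code below. The `Server` and `LoadBalancer` classes are purely illustrative; real deployments use dedicated software such as NGINX or HAProxy.

```python
class Server:
    def __init__(self, name):
        self.name = name

    def process(self, request):
        return f"{self.name} handled {request!r}"


class LoadBalancer:
    def __init__(self, servers):
        # Track how many requests each backend server is currently handling.
        self.active = {server: 0 for server in servers}

    def handle(self, request):
        # Pick the backend with the least load; the client only ever talks
        # to the load balancer, never to the individual servers.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        try:
            return server.process(request)
        finally:
            self.active[server] -= 1


lb = LoadBalancer([Server("web-1"), Server("web-2"), Server("web-3")])
print(lb.handle("GET /home"))
```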
Databases are horizontally scaled for a similar reason: if there is only one database and it crashes, the server can no longer retrieve any data. Horizontal scaling for databases, however, works slightly differently from scaling servers.
Among all the instances, one will be the Master DB (handles only write requests), and the others will be Slave DBs (used only for read queries). Data is synchronized between the master and slave DBs.
This process is called Database Replication, which provides better performance, reliability, and high availability.
Here too, there is ambiguity about which database the server should send a query to. The solution is, once again, to use another load balancer.
If the master DB crashes, one of the slave DBs is automatically promoted to become the new master DB for write queries. There are various algorithms used to determine which slave instance should be promoted to master.
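Here is a rough sketch of how the application layer could split reads and writes and promote a slave on failure. The `DBNode` class and the promotion rule are illustrative assumptions; real deployments rely on the database's own replication and failover tooling.

```python
import itertools


class DBNode:
    """Stand-in for a database connection."""
    def __init__(self, name):
        self.name = name

    def execute(self, query):
        return f"{self.name} ran: {query}"


class ReplicatedDatabase:
    def __init__(self, master, slaves):
        self.master = master           # all write queries go here
        self.slaves = list(slaves)     # read queries are served from here
        self._reads = itertools.cycle(self.slaves)

    def write(self, query):
        return self.master.execute(query)

    def read(self, query):
        # Spread read queries across the slave replicas (round robin).
        return next(self._reads).execute(query)

    def on_master_failure(self):
        # Promote one slave to be the new master. Which replica gets
        # promoted is decided by an election algorithm in real systems.
        self.master = self.slaves.pop(0)
        self._reads = itertools.cycle(self.slaves)


db = ReplicatedDatabase(DBNode("master"), [DBNode("slave-1"), DBNode("slave-2")])
db.write("INSERT INTO users ...")
db.read("SELECT * FROM users")
```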
Querying the database for each user request is a costly operation and increases the response time. To reduce this, we can set up a cache between the server and the database.
The cache usually stores data as key-value (K-V) pairs. Whenever the server needs data, it first checks the cache. If the data is not found there, it queries the database and stores the fetched result in the cache for future requests. There are different types of caches, and when designing the system you need to consider aspects like the eviction policy, expiration policy, and consistency requirements before using one.
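This read path is commonly called the cache-aside pattern. Below is a minimal sketch, assuming a plain in-memory dict in place of a real cache (such as Redis or Memcached) and a placeholder `query_database` function.

```python
import time

CACHE = {}          # key -> (value, expires_at)
TTL_SECONDS = 60    # expiration policy: cached entries live for one minute


def query_database(key):
    # Placeholder for a real (and comparatively slow) database query.
    return f"value-for-{key}"


def get(key):
    entry = CACHE.get(key)
    if entry is not None:
        value, expires_at = entry
        if time.time() < expires_at:
            return value            # cache hit: no database round trip
        del CACHE[key]              # expired entry: evict and fall through
    value = query_database(key)     # cache miss: fetch from the database
    CACHE[key] = (value, time.time() + TTL_SECONDS)
    return value
```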
CDN - Content Delivery Network
A cache helps reduce response time, but for users at distant locations the response time can still be high. This is where a CDN (Content Delivery Network) comes in.
A CDN is a geographically distributed set of servers used for delivering static content. CDN servers can store static files like images, videos, HTML, JavaScript, etc. When a user visits a website, the request is sent to the CDN server geographically nearest to the user.
If the CDN has the requested data, it serves it directly. Otherwise, it fetches it from the main server. Once retrieved, this data is stored in the CDN server for future requests.
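A toy sketch of that edge-server behaviour is shown below. The `EdgeServer` and `OriginServer` classes and the asset path are made up for illustration; real CDNs such as Cloudflare, Akamai, or CloudFront handle all of this transparently.

```python
class OriginServer:
    def fetch(self, path):
        # Stand-in for the main server holding the original static files.
        return f"<contents of {path}>"


class EdgeServer:
    def __init__(self, region, origin):
        self.region = region
        self.origin = origin
        self.cache = {}               # static assets cached at this edge

    def get(self, path):
        if path in self.cache:
            return self.cache[path]          # served from the nearby edge
        content = self.origin.fetch(path)    # first request: go to origin
        self.cache[path] = content           # keep a copy for future users
        return content


edge = EdgeServer("ap-south", OriginServer())
edge.get("/static/logo.png")   # miss: fetched from the origin server
edge.get("/static/logo.png")   # hit: served directly from the edge
```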
Data stored on CDN servers carries an HTTP header called TTL (Time To Live), which defines how long that data remains valid on the CDN.
Shared Session Storage
Platforms like Netflix serve mostly static content through CDNs, so their origin servers rarely get overloaded. But interactive platforms like WhatsApp, LinkedIn, and Instagram use sessions to process requests.
This means that whenever a user sends a POST request, the system first checks whether the user is logged in. This session data needs to be stored somewhere temporarily, for as long as the session is active. If it is stored on only one server, then in case of a crash, the session data would be lost.
To handle this, we use a shared session storage that holds only session-related information. This design is often referred to as a stateless architecture: any server can independently process a user's request because the session state lives outside the servers. User requests are no longer bound to a single server.
This shared session storage can be an RDBMS, cache, or NoSQL database.
Generally, NoSQL databases are preferred because they are easier to scale and store session data efficiently.
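As a rough sketch, a shared session store is essentially a key-value mapping from session ID to session data with an expiry. It is modelled here with an in-memory dict, but in practice it would live in a separate Redis or NoSQL store reachable by every web server; the TTL value is an assumption.

```python
import time
import uuid

SESSION_TTL = 30 * 60        # sessions stay valid for 30 minutes (assumed)
session_store = {}           # session_id -> (session data, expires_at)


def create_session(user_id):
    session_id = str(uuid.uuid4())
    session_store[session_id] = ({"user_id": user_id}, time.time() + SESSION_TTL)
    return session_id


def get_session(session_id):
    # Any web server can run this lookup, so a request is not tied to the
    # server that originally created the session.
    entry = session_store.get(session_id)
    if entry is None:
        return None
    data, expires_at = entry
    if time.time() > expires_at:
        del session_store[session_id]   # expired session: clean it up
        return None
    return data
```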
Written by ROHIT SINGH
Passionate about learning new technologies and tools and exploring the world of Technology.