Scaling Simplified

Scaling refers to the ability of a software system to handle more requests by expanding the underlying infrastructure without degrading the performance of the application.

Let’s consider an example. Suppose you open a hotel with an initial capacity of 10 rooms. Since you are new in the business, you don’t have many guests visiting your property, and hence your operations run smoothly with this capacity.

Fast forward six months: business is picking up, and more guests arrive on a regular basis, which causes delays in room availability. There’s chaos, with guests queued up waiting, which leads to cancellations. At this point, the obvious step is to increase capacity to handle the increase in demand.

To increase capacity, you can add more rooms to your hotel and in turn take more reservations. Say, instead of 10, you now have 20 rooms. This is called vertical scaling, i.e. increasing the capacity of the existing instance to meet increased demand.

This setup will work efficiently until demand increases again. Each time it does, more rooms have to be added. However, there’s a hard limit to the load the foundation of your building can bear.

To solve this problem, you can add another building with 20 rooms and call it Wing B; the initial building becomes Wing A. This is called horizontal scaling, i.e. increasing capacity by adding more instances similar to the existing one. This way, all guests can be accommodated efficiently without putting excessive load on the building’s foundation.
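The hotel analogy can be modelled in a few lines of Python. This is only a sketch: the `Building` and `Hotel` classes are hypothetical, and the limit of 20 rooms per building is an assumed figure, chosen to stand in for the foundation's hard limit.

```python
from dataclasses import dataclass, field

# Assumption for illustration: one building's foundation supports at most 20 rooms.
MAX_ROOMS_PER_BUILDING = 20


@dataclass
class Building:
    name: str
    rooms: int

    def scale_up(self, extra_rooms: int) -> int:
        """Vertical scaling: add rooms to this building, capped at the hard limit.

        Returns the number of rooms actually added.
        """
        added = min(extra_rooms, MAX_ROOMS_PER_BUILDING - self.rooms)
        self.rooms += added
        return added


@dataclass
class Hotel:
    buildings: list = field(default_factory=list)

    def scale_out(self, name: str, rooms: int) -> None:
        """Horizontal scaling: add another building next to the existing ones."""
        self.buildings.append(Building(name, rooms))

    @property
    def capacity(self) -> int:
        # Total capacity is the sum over all buildings (instances).
        return sum(b.rooms for b in self.buildings)


hotel = Hotel([Building("Wing A", 10)])
wing_a = hotel.buildings[0]

added = wing_a.scale_up(10)    # vertical: 10 -> 20 rooms
blocked = wing_a.scale_up(10)  # limit reached, nothing more can be added
hotel.scale_out("Wing B", 20)  # horizontal: a second building

print(added, blocked, hotel.capacity)  # 10 0 40
```

The second `scale_up` call adds nothing because Wing A has hit its limit, which is exactly the point at which `scale_out` becomes the only way to grow.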

Now that the concept has been established, let’s move on to the formal definitions and key differences between horizontal and vertical scaling.

| HORIZONTAL SCALING | VERTICAL SCALING |
| --- | --- |
| It refers to adding more instances alongside the current instance to increase overall compute, memory and performance. | It refers to increasing the compute, memory and performance of the current instance by replacing it with a bigger instance. |
| Load balancing is required to distribute load evenly across all instances. | Since a single instance is used, load balancing is not required. |
| It is a resilient architecture: if one server goes down, the load is redistributed among the other servers, typically with zero downtime. | It is not fault tolerant: the server is a single point of failure, so if it goes down, the whole application comes to a halt. |
| Communication between components is slower, as it relies on remote procedure calls (RPCs) over the network. | Most of the communication is inter-process in nature and hence faster. |
| Data consistency can be an issue, as each server may hold its own copy of the data. | As only one server is in use, data is always consistent. |
| It scales well with an increasing number of users. | It can’t scale beyond the hardware limit of the server. |
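To make the load-balancing and resilience points concrete, here is a minimal sketch of a round-robin balancer in Python. The `RoundRobinBalancer` class and the server names are hypothetical, and round-robin is just one simple distribution strategy. Real load balancers also handle health checks, connection draining and weighting, none of which is shown here.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in turn, skipping any that are marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._ring = cycle(self.servers)

    def mark_down(self, server):
        # Stop routing to a failed server; the others absorb its share.
        self.healthy.discard(server)

    def mark_up(self, server):
        # Resume routing to a known server once it has recovered.
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        # Look at each server at most once per call.
        for _ in range(len(self.servers)):
            server = next(self._ring)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["wing-a", "wing-b", "wing-c"])
first_pass = [balancer.next_server() for _ in range(3)]

balancer.mark_down("wing-b")  # simulate one server failing
after_failure = [balancer.next_server() for _ in range(4)]

print(first_pass)     # ['wing-a', 'wing-b', 'wing-c']
print(after_failure)  # ['wing-a', 'wing-c', 'wing-a', 'wing-c']
```

With three servers, losing one only shifts its traffic to the remaining two. With a single vertically scaled server there is nothing to fall back on, and `next_server` would raise.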
Written by

Prabhsimran Bajaj