
Recently, I started learning and understanding more about system design and distributed systems. You can read the first article in the series here.

Scalability

A service is said to be scalable if adding resources increases its performance in proportion to the resources added. Increasing performance generally means serving more units of work, but it can also mean handling larger units of work, such as when datasets grow. Simply put, scalability is the ability to handle more requests by buying more machines.
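To make "proportional to resources added" a bit more concrete, here is a minimal sketch that compares the throughput gain with the resource gain. The function name and the sample numbers are my own assumptions for illustration, not measurements from a real system.

```python
def scaling_efficiency(base_throughput, base_nodes, scaled_throughput, scaled_nodes):
    """Return how much of the added capacity turned into extra throughput.

    1.0 means perfectly proportional (linear) scaling; lower values mean
    that part of the added resources did not translate into performance.
    All names and inputs here are hypothetical.
    """
    speedup = scaled_throughput / base_throughput   # how much more work we serve
    resource_ratio = scaled_nodes / base_nodes      # how many more resources we paid for
    return speedup / resource_ratio


# Doubling the nodes doubles the requests per second: linear scaling.
print(scaling_efficiency(1000, 2, 2000, 4))
# Doubling the nodes adds only 50% more throughput: sub-linear scaling.
print(scaling_efficiency(1000, 2, 1500, 4))
```

A value close to 1.0 suggests the system scales well in the sense described above, while a value that keeps dropping as nodes are added hints at a bottleneck somewhere.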

Scalability of a system cannot be an afterthought. Applications and platforms should be designed with scaling in mind, so that adding resources actually improves performance. Many algorithms that perform reasonably well under lower load and smaller datasets can explode in cost if request rates increase, the dataset grows or the number of nodes in the distributed system increases.

Performance vs. Scalability

If you have a performance problem, your system is slow for a single user.

If you have a scalability problem, your system is fast for a single user but slower under heavy load.
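One way to picture the difference is a simple queueing model. The sketch below assumes a single server modelled as an M/M/1 queue, where the mean response time is the service time divided by (1 - utilization). The model choice, the function name and the numbers are assumptions for illustration, and real systems will behave less neatly.

```python
def mean_response_time(service_time_s, arrival_rate_per_s):
    """Mean response time of a single server under the M/M/1 assumption.

    service_time_s: average time to serve one request on an idle server.
    arrival_rate_per_s: average number of requests arriving per second.
    """
    utilization = arrival_rate_per_s * service_time_s
    if utilization >= 1.0:
        # Requests arrive faster than they can be served, so the queue
        # grows without bound.
        return float("inf")
    return service_time_s / (1.0 - utilization)


# A 10 ms request stays close to 10 ms when the server is almost idle...
print(mean_response_time(0.010, 1))
# ...but the same request takes roughly ten times longer at 90% utilization.
print(mean_response_time(0.010, 90))
```

In this model the service time never changes, so there is no performance problem for a single user. The slowdown comes purely from load, which is what a scalability problem looks like.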

Vertical Scaling

Vertical scaling (aka scaling up) describes adding resources, such as CPU, memory or storage, to an existing machine so that it meets demand. It is used to increase throughput without adding more machines.

Two advantages of vertical scaling are cheaper maintenance and lower complexity, because there are fewer nodes to manage. Disadvantages include a higher possibility of downtime, since upgrading a machine can take considerable time. There is also the infamous risk of a single point of failure: having all your operations on a single server increases the risk of losing all your data if a hardware or software failure occurs. Finally, there is a limit to how much you can upgrade a machine, as every machine has its threshold for RAM, storage and processing power.

Examples

Adding or reducing the CPU or memory capacity of the existing virtual machine is an example of vertical scaling. Vertical scaling is commonly seen in the field of virtualization. For instance, a virtual machine hosting a critical application may require more resources as the application’s demand grows. Instead of deploying additional VMs, which could complicate the infrastructure and management, the organization might opt to allocate more CPU cores, memory or disk space to the existing VM.

When it comes to geographic distribution, say if you plan to serve nationwide or global clients, it is unreasonable to expect them all to access your services from a single machine in a single location. In a situation like this, you’ll need to horizontally scale your resources to maintain your SLA.

Horizontal Scaling

Horizontal scaling (aka scaling out) refers to adding nodes or machines to your infrastructure to cope with new demands. If you are hosting an application on a server and find that it no longer has the capacity to handle traffic, adding a new server may be your solution. It is quite similar to delegating workload among several employees instead of one. It requires a distributed architecture. It offers higher failure resilience, because other machines in the cluster provide backup, in contrast to vertical scaling, which has lower failure resilience since the single machine is a single point of failure.

Horizontal scaling is a lot easier from a hardware perspective. All it requires you to do is add extra machines to your current pool. It can eliminate the need to analyze which system specification you need to upgrade. It also increases performance, because the load is delegated among multiple machines and there are more endpoints to handle incoming connections. Having said that, there are some caveats as well. Maintaining multiple servers is tougher than maintaining a single server. Additionally, you might need to add software for load balancing and possibly virtualization. Adding new servers can also cost more up front than upgrading an existing one.

Another point to note here is that load balancing is usually required in horizontal scaling but does not apply to vertical scaling, since there is no load to balance when there is only one machine.
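To show what "balancing the load" can mean in practice, here is a minimal sketch of round-robin, one of the simplest balancing strategies. The class name and the server names are hypothetical, and a real load balancer would also deal with health checks, weights and failures.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers from a fixed pool in turn (illustrative only)."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        # cycle() repeats the pool endlessly: a, b, c, a, b, c, ...
        self._cycle = cycle(self._servers)

    def next_server(self):
        """Return the server that should handle the next request."""
        return next(self._cycle)


balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
for request_id in range(5):
    print(request_id, "->", balancer.next_server())
```

With a single machine the pool has one entry and every request goes to the same place, which is why load balancing only becomes interesting once you scale out.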

Examples

Horizontal scaling comes up often when it comes to web server architecture. Companies like Facebook and Google use this scaling method to handle giant user bases by distributing the load across multiple servers. Google’s search engine infrastructure, for instance, spans thousands of servers distributed across data centers worldwide, each handling a fraction of the total search queries. By doing so, the performance of each search query can be enhanced and it provides Google with a failover mechanism in the event something goes wrong with one of their servers.

Cloud-based applications also utilize horizontal scaling. Platforms like AWS and Azure offer auto-scaling capabilities, adding or removing instances based on demand. An e-commerce site on AWS can dynamically scale out its web servers during a holiday sale to accommodate traffic surges and scale back in afterward to save costs. Another example is adding or removing virtual machines in a cluster of virtual machines.
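The idea behind auto-scaling can be sketched with a simple proportional rule: pick the number of instances that would bring the average utilization back to a target. The rule, the function name and the parameters below are assumptions for illustration. This is not the API or the exact algorithm of AWS, Azure or any other provider.

```python
def desired_instances(current_instances, cpu_percent, target_percent,
                      min_instances, max_instances):
    """Suggest an instance count that moves average CPU towards the target.

    Percentages are whole numbers (e.g. 90 for 90%) so the ceiling
    division below stays in integer arithmetic. All names are hypothetical.
    """
    # Ceiling of current_instances * cpu_percent / target_percent.
    needed = (current_instances * cpu_percent + target_percent - 1) // target_percent
    # Never go below the floor or above the ceiling configured for the group.
    return max(min_instances, min(max_instances, needed))


# Holiday sale: 4 instances at 90% CPU with a 60% target -> scale out.
print(desired_instances(4, 90, 60, min_instances=2, max_instances=10))
# After the sale: 4 instances at 15% CPU -> scale in, but keep the minimum.
print(desired_instances(4, 15, 60, min_instances=2, max_instances=10))
```

The minimum and maximum bounds matter in practice: the minimum keeps some capacity around for sudden spikes, and the maximum puts a cap on cost.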

The importance of scalability in system design


Pranav Bawgikar
