What is Scaling?

Scaling in AWS means adjusting infrastructure capacity to match demand. Instead of provisioning for peak load up front, organizations can allocate resources dynamically to balance performance against cost: adding capacity to absorb traffic surges or support growth, and releasing it during quiet periods. AWS provides a suite of services that make this adjustment possible with little or no manual effort.

Scaling Up / Vertical Scaling

Vertical scaling in AWS, also called scaling up or down, means changing the resources allocated to a single instance to match fluctuations in workload. In practice, you change the instance size to get more or less CPU, memory, or storage capacity. This raises the performance or capacity of an individual instance without changing the underlying architecture or adding instances. Vertical scaling has limits, though: an instance can only grow as large as the largest available instance type, resizing an EC2 instance typically requires stopping and restarting it, and it is not always the most cost-effective way to handle a large increase in workload. A single instance also remains a single point of failure: if it fails, the whole system is disrupted.
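As a rough sketch of what resizing a single instance involves, the function below strings together the AWS CLI calls for stopping an instance, changing its type, and starting it again. The instance ID and the target type t2.small are hypothetical, and the `AWS_CMD` dry-run switch is an assumption added for illustration: by default the commands are only printed, not executed.

```shell
# Dry-run by default: AWS_CMD prints each command instead of executing it.
# Set AWS_CMD=aws to run the calls for real (requires configured credentials).
AWS_CMD="${AWS_CMD:-echo aws}"

# resize_instance INSTANCE_ID NEW_TYPE
# Stops the instance, changes its instance type, and starts it again.
resize_instance() {
  instance_id="$1"
  new_type="$2"
  $AWS_CMD ec2 stop-instances --instance-ids "$instance_id"
  $AWS_CMD ec2 wait instance-stopped --instance-ids "$instance_id"
  $AWS_CMD ec2 modify-instance-attribute --instance-id "$instance_id" \
    --instance-type "Value=$new_type"
  $AWS_CMD ec2 start-instances --instance-ids "$instance_id"
}

# Hypothetical instance ID and target type, for illustration only.
resize_instance i-0123456789abcdef0 t2.small
```

The stop and start steps are the reason vertical scaling usually implies a short period of downtime for the instance being resized.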

Scaling Out / Horizontal Scaling

Horizontal scaling in AWS, also called scaling out or in, means changing the number of instances in a group to match fluctuations in workload. The workload is distributed across multiple instances, which improves fault tolerance, availability, and performance. It lets you meet growing demand without depending on a single large instance, and in some scenarios it costs less than vertical scaling. With enough instances sharing the load, an application or service can stay responsive and available even during peak traffic.
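The sizing arithmetic behind scaling out can be sketched with a small helper. It assumes, purely for illustration, that load is measured in requests per second and that every instance handles the same amount; the numbers are hypothetical.

```shell
# instances_needed TOTAL_LOAD PER_INSTANCE_CAPACITY
# Prints the smallest instance count whose combined capacity covers the load
# (integer ceiling division).
instances_needed() {
  total_load="$1"
  per_instance="$2"
  echo $(( (total_load + per_instance - 1) / per_instance ))
}

# Hypothetical numbers: 1200 requests/s of traffic, 500 requests/s per instance.
instances_needed 1200 500   # prints 3
```

Two instances would cover only 1000 requests per second here, so the count is rounded up rather than down.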

What is Auto Scaling, and What Are Its Types?

Auto Scaling in AWS automatically changes the number of instances in a group based on conditions or metrics that you define. Your infrastructure adjusts to changes in workload without manual intervention: capacity is added when it is needed and removed during periods of reduced demand, which keeps performance and availability up while keeping costs down. Because Auto Scaling is a form of horizontal scaling, the workload is spread across multiple instances, which improves fault tolerance and resilience. This makes it a key building block for scalable and reliable applications on AWS.

The diagram shows an Auto Scaling group spanning two Availability Zones. Using CloudWatch metrics and alarms, the group adjusts its instance count to follow traffic, scaling out during peaks and scaling in during lulls. The Auto Scaling group also replaces any instance that fails, which provides fault tolerance.
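To make the idea of metric-driven adjustment concrete, here is a simplified proportional model: the instance count is scaled by the ratio of the observed metric to its target, then clamped to the group's bounds. This is an illustrative assumption, not the algorithm AWS uses internally, and the function name and sample values are hypothetical.

```shell
# next_capacity CURRENT METRIC TARGET MIN MAX
# Proportional rule: scale the instance count by METRIC/TARGET (rounded up),
# then clamp the result to the group's [MIN, MAX] bounds.
# METRIC and TARGET are integer percentages, e.g. average CPU utilization.
next_capacity() {
  current="$1"; metric="$2"; target="$3"; min="$4"; max="$5"
  wanted=$(( (current * metric + target - 1) / target ))
  if [ "$wanted" -lt "$min" ]; then wanted="$min"; fi
  if [ "$wanted" -gt "$max" ]; then wanted="$max"; fi
  echo "$wanted"
}

# 2 instances at 90% CPU with a 50% target, bounds 1..4: scale out.
next_capacity 2 90 50 1 4   # prints 4
# 4 instances at 20% CPU with a 50% target, bounds 1..4: scale in.
next_capacity 4 20 50 1 4   # prints 2
```

The clamp matters as much as the ratio: however high the metric climbs, the group never grows past its maximum, and it never shrinks below its minimum.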

Auto Scaling supports the following types of scaling:

Manual - You change the size of the ASG yourself.
Dynamic - Scales automatically in response to demand, as measured by metrics such as CPU utilization.
Predictive - Uses machine learning to forecast demand and scale ahead of it.
Scheduled - Scales at predetermined times, such as known daily or weekly peaks.
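Three of these types map onto AWS CLI calls that can be sketched as follows. The group name demo-asg, the policy and action names, the 50% CPU target, and the weekday schedule are all hypothetical, and the `AWS_CMD` dry-run switch is an assumption added so that the commands are printed rather than executed by default. Predictive scaling is left out to keep the sketch short.

```shell
# Dry-run by default: AWS_CMD prints each command instead of executing it.
# Set AWS_CMD=aws to run the calls for real (requires configured credentials).
AWS_CMD="${AWS_CMD:-echo aws}"
ASG_NAME="demo-asg"   # hypothetical group name

# Manual: set the desired capacity directly.
manual_scale() {
  $AWS_CMD autoscaling set-desired-capacity \
    --auto-scaling-group-name "$ASG_NAME" --desired-capacity "$1"
}

# Dynamic: target tracking policy that aims for the given average CPU percentage.
dynamic_policy() {
  $AWS_CMD autoscaling put-scaling-policy \
    --auto-scaling-group-name "$ASG_NAME" \
    --policy-name cpu-target-tracking \
    --policy-type TargetTrackingScaling \
    --target-tracking-configuration \
    "{\"PredefinedMetricSpecification\":{\"PredefinedMetricType\":\"ASGAverageCPUUtilization\"},\"TargetValue\":$1}"
}

# Scheduled: set the desired capacity on a recurring cron schedule.
# scheduled_scale ACTION_NAME CRON_EXPRESSION DESIRED_CAPACITY
scheduled_scale() {
  $AWS_CMD autoscaling put-scheduled-update-group-action \
    --auto-scaling-group-name "$ASG_NAME" \
    --scheduled-action-name "$1" \
    --recurrence "$2" \
    --desired-capacity "$3"
}

manual_scale 3
dynamic_policy 50
scheduled_scale weekday-morning "0 9 * * 1-5" 4
```

Cron expressions in scheduled actions are interpreted in UTC unless a time zone is specified, so the example schedule would fire at 09:00 UTC on weekdays.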

Hands-On: Creating an Auto Scaling Group

Here's the user data script to copy into the launch template. Every instance launched from the template runs it at first boot:

#!/bin/bash

# Update package repository and install Apache
yum update -y
yum install -y httpd

# Start and enable Apache service
systemctl start httpd
systemctl enable httpd

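# Retrieve the Availability Zone of the EC2 instance.
# Request an IMDSv2 session token first; instances that enforce IMDSv2 reject requests without one.
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
EC2AZ=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/placement/availability-zone)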

# Create HTML content with the Availability Zone information
echo "<center><h1>This Amazon EC2 instance is located in Availability Zone: $EC2AZ </h1></center>" > /var/www/html/index.html

In the EC2 console, open Launch Templates, create a new launch template, and provide a name for it.

Choose a yum-based Linux AMI such as Amazon Linux and select the t2.micro instance type, which is eligible for the free tier.
Do not specify a key pair.
Choose a security group that allows inbound HTTP (port 80) so that the web server is reachable.
In the Advanced details section, scroll down to the User data field, paste the script, and create the launch template.
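The same launch template can also be sketched with the AWS CLI. One detail the console hides is that user data must be supplied as base64 text. The template name, AMI ID, security group ID, and file name below are hypothetical placeholders, and the `AWS_CMD` dry-run switch is an assumption added so that the command is printed rather than executed by default.

```shell
# Dry-run by default: AWS_CMD prints the command instead of executing it.
# Set AWS_CMD=aws to run the call for real (requires configured credentials).
AWS_CMD="${AWS_CMD:-echo aws}"

# Reads a script on stdin and prints it as single-line base64,
# the form that launch template data expects for user data.
encode_user_data() {
  base64 | tr -d '\n'
}

# create_web_template NAME AMI_ID SECURITY_GROUP_ID USER_DATA_FILE
# Builds the launch template data as JSON and passes it to the CLI.
create_web_template() {
  user_data="$(encode_user_data < "$4")"
  $AWS_CMD ec2 create-launch-template \
    --launch-template-name "$1" \
    --launch-template-data "{\"ImageId\":\"$2\",\"InstanceType\":\"t2.micro\",\"SecurityGroupIds\":[\"$3\"],\"UserData\":\"$user_data\"}"
}

# Example call with hypothetical IDs, assuming the script is saved as user-data.sh:
# create_web_template web-template ami-0123456789abcdef0 sg-0123456789abcdef0 user-data.sh
```

No key pair appears in the template data, which matches the console steps: the instances are reachable over HTTP only.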

With the launch template in place, let's create the Auto Scaling group that uses it.

In the EC2 console, open Auto Scaling Groups and create a group that uses the launch template created earlier.
Choose the default VPC and select the Availability Zones us-east-1a and us-east-1b.
Skip the load balancer for now.
Set the desired capacity, which is the number of instances the ASG tries to keep running. The minimum and maximum capacity set the lower and upper bounds on the number of instances in the ASG.
Review the summary and create the Auto Scaling group (ASG). The Activity tab of the new ASG shows ongoing activities and status updates.
The ASG now launches instances to match the desired capacity.
To test the ASG, terminate one instance manually. The expected behavior is that the ASG launches a replacement automatically, because the group has dropped below its desired capacity.
The activity history now shows the replacement instance being launched.
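For reference, the console steps above correspond roughly to the AWS CLI calls sketched below. The group name, template name, instance ID, and the capacities (minimum 1, maximum 3, desired 2) are hypothetical, and the `AWS_CMD` dry-run switch is an assumption added so that the commands are printed rather than executed by default.

```shell
# Dry-run by default: AWS_CMD prints each command instead of executing it.
# Set AWS_CMD=aws to run the calls for real (requires configured credentials).
AWS_CMD="${AWS_CMD:-echo aws}"

# create_asg NAME TEMPLATE_NAME MIN MAX DESIRED
# Creates an ASG in the default VPC across us-east-1a and us-east-1b.
create_asg() {
  $AWS_CMD autoscaling create-auto-scaling-group \
    --auto-scaling-group-name "$1" \
    --launch-template "LaunchTemplateName=$2" \
    --min-size "$3" --max-size "$4" --desired-capacity "$5" \
    --availability-zones us-east-1a us-east-1b
}

# simulate_failure INSTANCE_ID
# Terminates one instance so the ASG has to replace it.
simulate_failure() {
  $AWS_CMD ec2 terminate-instances --instance-ids "$1"
}

# show_activities NAME
# Lists recent scaling activities, including replacement launches.
show_activities() {
  $AWS_CMD autoscaling describe-scaling-activities \
    --auto-scaling-group-name "$1" --max-items 5
}

create_asg demo-asg web-template 1 3 2
simulate_failure i-0123456789abcdef0
show_activities demo-asg
```

After the termination, the scaling activities should include an entry for the replacement launch, mirroring what the Activity tab shows in the console.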

This is how Auto Scaling groups respond to instance failures or increased demand, keeping the system available for use.
