Disclaimer

This article was largely written with the help of GPT-4o mini.

What Are Auto Scaling Groups (ASGs)?

Auto Scaling Groups (ASGs) are a feature of Amazon EC2 Auto Scaling that automatically adjusts the number of Amazon EC2 instances in response to changing demand. This helps ensure that the right number of instances is running to handle your application’s traffic and workloads efficiently.

Auto Scaling Groups were introduced by AWS to address the growing need for scalable and reliable infrastructure. Since their launch, they have gained features such as Elastic Load Balancing (ELB) integration, more advanced scaling policies, and support for multiple instance types and sizes within one group. AWS continues to add features that improve scalability and flexibility.

ASGs automatically manage the number of EC2 instances by scaling out (adding instances) or scaling in (removing instances) based on predefined policies. They perform health checks on instances, integrate with load balancers, and provide monitoring and notifications for scaling events.

Architecture Structure

Launch Configuration/Template: Defines the EC2 instance settings, such as AMI, instance type, and security groups.
Scaling Policies: Rules that dictate when and how to scale instances based on metrics and alarms.
Health Checks: Mechanisms to monitor and maintain the health of instances.
Load Balancers: Distribute incoming traffic across instances, integrated with ASGs to automatically register and deregister instances.
CloudWatch Metrics: Provide monitoring and alarms for scaling activities.

How ASG Works

Auto Scaling Groups (ASGs) manage the number of EC2 instances running your application. They automate scaling so that your application stays performant and cost-effective as demand fluctuates. The sections below walk through how ASGs operate:

Setting up ASG

Launch Configuration or Launch Template:

Launch Configuration: Defines the settings for launching new EC2 instances, including the AMI ID, instance type, key pair, security groups, and other instance-specific parameters.
Launch Template: Offers more flexibility than Launch Configurations, allowing versioning, the specification of multiple instance types, and additional parameters like instance metadata options. It supports more advanced configurations and is recommended for new setups.
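
As a rough illustration, the settings above could be expressed as request parameters for boto3 (the AWS SDK for Python). This is only a sketch: the template name, AMI ID, key pair, and security group ID are placeholders that do not come from this article, and the actual boto3 call is left in a comment so the snippet runs without AWS credentials.

```python
# Sketch of launch template parameters, shaped for boto3's
# ec2.create_launch_template call. All names and IDs are placeholders.
launch_template_request = {
    "LaunchTemplateName": "web-app-template",  # hypothetical name
    "VersionDescription": "initial version",   # templates are versioned
    "LaunchTemplateData": {
        "ImageId": "ami-0123456789abcdef0",            # placeholder AMI ID
        "InstanceType": "t3.micro",
        "KeyName": "my-key-pair",                      # placeholder key pair
        "SecurityGroupIds": ["sg-0123456789abcdef0"],  # placeholder group
        # Instance metadata options; "required" enforces IMDSv2 tokens.
        "MetadataOptions": {"HttpTokens": "required"},
    },
}

# With boto3 installed and credentials configured, the call would be roughly:
#   import boto3
#   boto3.client("ec2").create_launch_template(**launch_template_request)
print(sorted(launch_template_request["LaunchTemplateData"]))
```

Keeping the parameters in a plain dictionary makes it easy to review or version them before anything is sent to AWS.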

Some of the things to consider when setting up ASG include:

Desired Capacity: This is the number of instances the ASG should maintain. The ASG will attempt to keep this number of instances running, scaling out or in as needed to match the desired capacity.
Minimum and Maximum Capacity: These set the lower and upper bounds on the number of instances in the group. The ASG will not scale below the minimum or above the maximum, regardless of scaling policies or health checks.
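
The relationship between these three values can be sketched in a few lines of Python. This is a simplified model, not AWS code: the function name is made up for illustration, and it only shows how a capacity requested by a scaling policy ends up bounded by the minimum and maximum.

```python
def clamp_desired_capacity(requested: int, minimum: int, maximum: int) -> int:
    """Return the capacity the group would settle on for a requested value."""
    if minimum > maximum:
        raise ValueError("minimum capacity cannot exceed maximum capacity")
    # Never go below the minimum or above the maximum.
    return max(minimum, min(requested, maximum))


# A hypothetical group with a minimum of 2 and a maximum of 10 instances:
print(clamp_desired_capacity(1, 2, 10))   # 2
print(clamp_desired_capacity(6, 2, 10))   # 6
print(clamp_desired_capacity(15, 2, 10))  # 10
```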

ASG Scaling Policies

Scaling policies determine when and how the ASG should adjust the number of instances. There are several types of scaling policies:

Simple Scaling: Executes a specific action (e.g., add or remove a fixed number of instances) based on a CloudWatch alarm.
Step Scaling: Adjusts the number of instances based on the size of the alarm breach. For example, CPU usage slightly above the threshold might add one instance, while a much larger breach might add several.
Target Tracking Scaling: Maintains a specific metric value (e.g., CPU utilization at 50%) by dynamically adjusting the number of instances.
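
To make the difference between step scaling and target tracking more concrete, here is a small Python sketch. The step boundaries and the proportional formula are assumptions chosen for illustration; AWS does not publish the exact target tracking algorithm, so treat the second function as an approximation of the idea rather than the service's behavior.

```python
import math


def step_scaling_adjustment(metric: float, threshold: float, steps) -> int:
    """Pick the instance change for a metric using step adjustments.

    `steps` is a list of (lower_offset, upper_offset, change) tuples. Offsets
    are relative to the alarm threshold; upper_offset may be None, meaning
    "no upper bound".
    """
    breach = metric - threshold
    for lower, upper, change in steps:
        if breach >= lower and (upper is None or breach < upper):
            return change
    return 0  # metric is below the threshold, so nothing changes


def target_tracking_capacity(current: int, metric: float, target: float) -> int:
    """Estimate the capacity that brings an average metric back to target."""
    return max(1, math.ceil(current * metric / target))


# Hypothetical steps for a 70% CPU alarm: +1 below 80%, +2 below 90%, +4 above.
cpu_steps = [(0, 10, 1), (10, 20, 2), (20, None, 4)]
print(step_scaling_adjustment(75, 70, cpu_steps))  # 1
print(step_scaling_adjustment(95, 70, cpu_steps))  # 4
print(target_tracking_capacity(4, 75, 50))         # 6
```

In the last call, four instances averaging 75% CPU against a 50% target suggest roughly six instances, since 4 × 75 / 50 = 6.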

CloudWatch Alarms: Alarms are set to monitor specific metrics. When a metric breaches a defined threshold, the alarm triggers the corresponding scaling policy. Metrics can include CPU utilization, network traffic, or custom metrics specific to your application.
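
The alarm logic can be pictured as a check over the most recent datapoints. The sketch below is a simplification under the assumption that an alarm fires only when every datapoint in the evaluation window exceeds the threshold; real CloudWatch alarms have more options, and the function name is invented for this example.

```python
def alarm_in_breach(datapoints, threshold: float, evaluation_periods: int) -> bool:
    """True when the latest `evaluation_periods` datapoints all exceed threshold."""
    if evaluation_periods < 1:
        raise ValueError("evaluation_periods must be at least 1")
    if len(datapoints) < evaluation_periods:
        return False  # not enough data to evaluate yet
    recent = datapoints[-evaluation_periods:]
    return all(value > threshold for value in recent)


# One CPU datapoint per minute, checked against 70% over 5 periods.
print(alarm_in_breach([55, 72, 74, 71, 78, 80], 70, 5))  # True
print(alarm_in_breach([55, 72, 60, 71, 78, 80], 70, 5))  # False
```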

ASG Health Checks

Instance Health Checks: The ASG performs regular health checks to ensure instances are functioning properly. Health checks can be:

EC2 Status Checks: AWS’s built-in checks that verify the instance is running and reachable, covering both the underlying host and the instance itself.
Custom Health Checks: Defined by the user, these can include application-level checks or other criteria specific to your environment. Their result is reported to the ASG, which then treats the instance as healthy or unhealthy.

Replacement of Unhealthy Instances: If an instance fails a health check, the ASG automatically terminates the unhealthy instance and launches a new one to replace it. This ensures that the desired capacity and performance are maintained.
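
The replacement behavior can be modeled as a small reconciliation step. This is a toy simulation, not AWS code: the `Instance` class, the `i-sim…` IDs, and the helper functions are all hypothetical and exist only to show the terminate-then-relaunch idea.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Instance:
    instance_id: str
    healthy: bool = True


_ids = count(1)  # simple counter for simulated instance IDs


def launch_instance() -> Instance:
    return Instance(instance_id=f"i-sim{next(_ids):04d}")


def replace_unhealthy(instances, desired_capacity):
    """Drop unhealthy instances, then launch replacements up to desired capacity."""
    healthy = [i for i in instances if i.healthy]
    terminated = [i for i in instances if not i.healthy]
    while len(healthy) < desired_capacity:
        healthy.append(launch_instance())
    return healthy, terminated


fleet = [launch_instance() for _ in range(3)]
fleet[1].healthy = False  # pretend the second instance failed a health check
fleet, terminated = replace_unhealthy(fleet, desired_capacity=3)
print([i.instance_id for i in fleet])       # ['i-sim0001', 'i-sim0003', 'i-sim0004']
print([i.instance_id for i in terminated])  # ['i-sim0002']
```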

ASG Scaling Actions

Scaling Out (Adding Instances): When the ASG determines that more capacity is needed (e.g., due to high traffic or a breached alarm threshold), it launches new EC2 instances based on the launch configuration/template. These instances are added to the ASG and, if the group is attached to a load balancer, are registered with it so they can receive traffic.

Scaling In (Removing Instances): When demand decreases (e.g., low traffic or an alarm indicating excess capacity), the ASG terminates excess instances. Termination policies determine which instances are removed, and the group never drops below its minimum capacity.

ASG Integration with Load Balancers

Registration with Load Balancers: New instances launched by the ASG are automatically registered with the attached Elastic Load Balancing (ELB) load balancer, for example an Application Load Balancer (ALB). This ensures that incoming traffic is distributed across all healthy instances.

Deregistration of Terminated Instances: When instances are terminated, they are automatically deregistered from the load balancer, preventing traffic from being routed to instances that are no longer operational.
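
A minimal sketch of this register and deregister cycle is shown below. The `TargetGroup` class and the round-robin routing are assumptions made for illustration; real load balancers use their own routing algorithms and health checks.

```python
from itertools import cycle


class TargetGroup:
    """Toy stand-in for the set of instances behind a load balancer."""

    def __init__(self):
        self.targets = []

    def register(self, instance_id: str) -> None:
        if instance_id not in self.targets:
            self.targets.append(instance_id)

    def deregister(self, instance_id: str) -> None:
        if instance_id in self.targets:
            self.targets.remove(instance_id)

    def route(self, request_count: int):
        """Assign requests round-robin across the registered targets."""
        if not self.targets:
            raise RuntimeError("no registered targets")
        rotation = cycle(self.targets)
        return [next(rotation) for _ in range(request_count)]


group = TargetGroup()
for instance_id in ("i-a", "i-b", "i-c"):  # instances launched by the ASG
    group.register(instance_id)
group.deregister("i-b")                    # instance terminated on scale-in
print(group.route(4))  # ['i-a', 'i-c', 'i-a', 'i-c']
```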

ASG Monitoring and Notifications

CloudWatch Metrics: ASGs rely on CloudWatch to provide real-time metrics and monitoring. Metrics include instance health, resource utilization, and performance indicators.

Notifications: Amazon SNS (Simple Notification Service) can be configured to send notifications about scaling activities, such as when instances are launched or terminated. This helps in monitoring and responding to scaling events.

ASG Termination Policies

Termination Policies: These determine how instances are selected for termination when scaling in. Common policies include:

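Oldest Launch Configuration (OldestLaunchConfiguration): Terminates instances that were launched from the oldest launch configuration, which helps phase out outdated settings.
Newest Instance (NewestInstance): Terminates the most recently launched instances first, which can be useful when you are testing a new configuration and do not want to keep those instances running.
Closest to Next Instance Hour (ClosestToNextInstanceHour): Terminates instances that are closest to the next billing hour, which can reduce cost for instances that are billed by the hour.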
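
The three policies above can be sketched as selection functions over a list of instances. This is a simplified model: the `Instance` fields, including the numeric `config_version`, are hypothetical, and the real service also takes Availability Zone balance into account before applying a policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Instance:
    instance_id: str
    launch_time: datetime
    config_version: int  # hypothetical: lower number = older launch configuration


def oldest_launch_configuration(instances):
    return min(instances, key=lambda i: i.config_version)


def newest_instance(instances):
    return max(instances, key=lambda i: i.launch_time)


def closest_to_next_instance_hour(instances, now):
    """Pick the instance with the least time left in its current running hour."""
    def seconds_left(instance):
        elapsed = (now - instance.launch_time).total_seconds()
        return 3600 - (elapsed % 3600)
    return min(instances, key=seconds_left)


now = datetime(2024, 1, 1, 12, 0, 0)
fleet = [
    Instance("i-old", now - timedelta(hours=5, minutes=10), config_version=1),
    Instance("i-mid", now - timedelta(hours=2, minutes=55), config_version=2),
    Instance("i-new", now - timedelta(minutes=20), config_version=2),
]
print(oldest_launch_configuration(fleet).instance_id)         # i-old
print(newest_instance(fleet).instance_id)                     # i-new
print(closest_to_next_instance_hour(fleet, now).instance_id)  # i-mid
```

Here `i-mid` is 55 minutes into its current hour, so it has the least paid time remaining and is picked by the third policy.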

ASG Use Cases

Handling Traffic Spikes: Automatically scale out during traffic spikes to maintain performance and scale in during low traffic to reduce costs.
Cost Management: Optimize resource usage by scaling in unused resources, ensuring you only pay for what you need.
High Availability: Maintain application availability by replacing failed instances automatically and distributing traffic across healthy instances.
Scheduled Scaling: Adjust the number of instances based on known traffic patterns, such as increasing capacity before a major event.

ASG Examples

Consider an e-commerce website that experiences heavy traffic during sales events. Here’s how ASG can manage this scenario:

Setup: You create an ASG with a launch template specifying the desired instance type and settings.
Scaling Policy: Set up a policy to add more instances if CPU utilization exceeds 70% for 5 minutes.
Health Checks: The ASG monitors instances and replaces any that fail health checks.
Scaling Action: During a sale, CPU utilization spikes. The scaling policy triggers, and the ASG launches additional instances to handle the increased load.
Load Balancer Integration: New instances are automatically registered with the ELB, ensuring traffic is distributed evenly.
Scaling In: After the sale ends, CPU utilization drops. The ASG scales in by terminating excess instances to save on costs.
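
The scenario above can be walked through with a short simulation. Everything beyond the 70% for 5 minutes rule is an assumption made for illustration: the load figures, the scale-in rule (below 30% for 5 minutes), the step sizes, and the capacity limits are invented, and average CPU is modeled crudely as total load divided by instance count.

```python
def simulate(load_per_minute, minimum=2, maximum=8, start=2):
    """Return the instance count after each simulated minute."""
    instances = start
    high_streak = low_streak = 0
    history = []
    for load in load_per_minute:
        cpu = min(100.0, load / instances)  # crude average CPU percentage
        high_streak = high_streak + 1 if cpu > 70 else 0
        low_streak = low_streak + 1 if cpu < 30 else 0
        if high_streak >= 5 and instances < maximum:
            instances = min(maximum, instances + 2)  # scale out by 2
            high_streak = 0
        elif low_streak >= 5 and instances > minimum:
            instances = max(minimum, instances - 1)  # scale in by 1
            low_streak = 0
        history.append(instances)
    return history


# 5 quiet minutes, a 12-minute sale spike, then 22 minutes of normal traffic.
load = [80] * 5 + [320] * 12 + [80] * 22
history = simulate(load)
print(max(history), history[-1])  # 6 2
```

With these made-up numbers the group grows from 2 to 6 instances during the spike and gradually returns to 2 once traffic drops.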
