AWS Architecture: Best Practices for Scaling and Load Balancing

Salome Githinji
5 min read

Introduction

In today’s cloud-driven world, ensuring your applications are highly available, scalable, and fault-tolerant is crucial. AWS provides powerful services like Elastic Load Balancing (ELB) and EC2 Auto Scaling to help distribute traffic efficiently and automatically adjust capacity based on demand.

This is a continuation of our previous article, Build Your First Secure AWS VPC: Public/Private Subnets, NAT Gateways & Web Servers!

We are going to build on top of that architecture:

https://hashnode.com/post/cmcw43tur000l02jvfwk98y14

The architecture we start with, before adding scaling and load balancing, is shown below.

The architecture we are going to build is shown below.

In this hands-on lab, we’ll:

  • Convert an existing EC2 instance into an AMI for consistent deployments.

  • Set up an Application Load Balancer (ALB) to distribute traffic.

  • Create an Auto Scaling Group (ASG) to automatically scale instances.

  • Test scaling policies by simulating traffic spikes.

🔹 Task 1: Creating an AMI for Auto Scaling

Since we want new instances to match our existing web server, we’ll create an Amazon Machine Image (AMI).

  1. Open the EC2 Dashboard → Instances.

  2. Select Web Server 1 (running instance).

  3. Click Actions → Image and templates → Create image.

  4. Configure:

    • Image name: Web Server AMI

    • Description: Lab AMI for Web Server

  5. Click Create image.

Why? This ensures every new instance launched by Auto Scaling has the same setup.
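
If you prefer to script this step, the same action can be done with the AWS SDK for Python (boto3). The sketch below is a minimal example, not the lab's required method; the instance ID and region are placeholders you would replace with your own values.

```python
# Minimal boto3 sketch of Task 1: create an AMI from the running web server.
# The instance ID and region below are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # adjust to your region

response = ec2.create_image(
    InstanceId="i-0123456789abcdef0",   # placeholder: Web Server 1's instance ID
    Name="Web Server AMI",
    Description="Lab AMI for Web Server",
)
print("New AMI ID:", response["ImageId"])
```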

🔹 Task 2: Creating a Load Balancer

We’ll use an Application Load Balancer (ALB) to distribute traffic across instances.

  1. Navigate to EC2 → Load Balancers → Create Load Balancer.

  2. Choose Application Load Balancer.

  3. Configure:

    • Name: LabELB

    • VPC: Lab VPC

    • For the first Availability Zone, choose Public Subnet 1.

    • For the second Availability Zone, choose Public Subnet 2.

    • Security Group: Web Security Group (allows HTTP traffic).

  4. Under Listeners & Routing, click Create target group.

  • Target type: Instances

  • Name: lab-target-group

  • Click Next → Create target group.

  • Back in the ALB setup, refresh and select lab-target-group.

  • Click Create load balancer.

📌 Note: Copy the DNS name of the ALB (we’ll use it later).

Why? The ALB ensures traffic is evenly distributed and checks instance health.
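
For reference, the same setup can be sketched with boto3: create the target group, the internet-facing ALB across both public subnets, and an HTTP listener that forwards to the target group. The VPC, subnet, and security group IDs below are placeholders for Lab VPC, Public Subnet 1/2, and Web Security Group.

```python
# Hedged boto3 sketch of Task 2: target group, ALB, and HTTP listener.
import boto3

elbv2 = boto3.client("elbv2", region_name="us-east-1")

# Target group the ALB will forward HTTP traffic to.
tg = elbv2.create_target_group(
    Name="lab-target-group",
    Protocol="HTTP",
    Port=80,
    VpcId="vpc-0123456789abcdef0",      # placeholder: Lab VPC
    TargetType="instance",
)
tg_arn = tg["TargetGroups"][0]["TargetGroupArn"]

# Internet-facing Application Load Balancer in both public subnets.
alb = elbv2.create_load_balancer(
    Name="LabELB",
    Subnets=["subnet-public1", "subnet-public2"],   # placeholders: Public Subnet 1 and 2
    SecurityGroups=["sg-0123456789abcdef0"],        # placeholder: Web Security Group
    Scheme="internet-facing",
    Type="application",
)
print("ALB DNS name:", alb["LoadBalancers"][0]["DNSName"])  # save this for Task 5

# Listener that forwards port 80 requests to the target group.
elbv2.create_listener(
    LoadBalancerArn=alb["LoadBalancers"][0]["LoadBalancerArn"],
    Protocol="HTTP",
    Port=80,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": tg_arn}],
)
```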

🔹 Task 3: Creating a Launch Template

Auto Scaling needs a launch template to know how to configure new instances.

  1. Go to EC2 → Launch Templates → Create launch template.

  2. Configure:

    • Name: lab-app-launch-template

    • For Template version description, enter A web server for the load test app

    • For Auto Scaling guidance, choose Provide guidance to help me set up a template that I can use with EC2 Auto Scaling.

    • AMI: Select the Web Server AMI (created earlier).

    • Instance type: t3.micro

    • Security Group: Web Security Group

  3. Click Create launch template.

Why? This defines the "blueprint" for new instances.
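
The equivalent launch template can also be created with a short boto3 call. The AMI and security group IDs below are placeholders for the Web Server AMI and Web Security Group created earlier.

```python
# Minimal boto3 sketch of Task 3: a launch template for Auto Scaling.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

ec2.create_launch_template(
    LaunchTemplateName="lab-app-launch-template",
    VersionDescription="A web server for the load test app",
    LaunchTemplateData={
        "ImageId": "ami-0123456789abcdef0",            # placeholder: Web Server AMI
        "InstanceType": "t3.micro",
        "SecurityGroupIds": ["sg-0123456789abcdef0"],  # placeholder: Web Security Group
    },
)
```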

🔹 Task 4: Creating an Auto Scaling Group

Now, we’ll set up Auto Scaling to manage instance count dynamically.

  1. Choose lab-app-launch-template, and then, from the Actions dropdown list, choose Create Auto Scaling group.

  2. Configure:

    • Name: Lab Auto Scaling Group

    • Choose Next.

    • VPC: Lab VPC

    • From the Availability Zones and subnets dropdown list, choose Private Subnet 1 (10.0.1.0/24) and Private Subnet 2 (10.0.3.0/24).

    • Choose Next.

  3. In the Attach to an existing load balancer section, select Choose from your load balancer target groups, then attach the existing target group (lab-target-group).

  4. Set Health check type to ELB (ensures only healthy instances receive traffic), then choose Next.

  5. Configure scaling:

    • Desired capacity: 2

    • Minimum capacity: 2

    • Maximum capacity: 4

  6. Enable a Target tracking scaling policy:

    • Metric: Average CPU utilization

    • Target value: 50 (scales out when average CPU exceeds this).

  7. Add a tag (Name: Lab Instance).

  8. Click Create Auto Scaling group.

Why? Auto Scaling ensures we always have enough instances to handle traffic.
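
As a scripted counterpart, here is a hedged boto3 sketch of the same group and its target tracking policy. The subnet IDs, target group ARN, and policy name are placeholders; note that the console attaches the scaling policy during group creation, while the SDK adds it with a separate call.

```python
# Hedged boto3 sketch of Task 4: Auto Scaling group plus target tracking policy.
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="Lab Auto Scaling Group",
    LaunchTemplate={
        "LaunchTemplateName": "lab-app-launch-template",
        "Version": "$Latest",
    },
    MinSize=2,
    MaxSize=4,
    DesiredCapacity=2,
    VPCZoneIdentifier="subnet-private1,subnet-private2",  # placeholders: Private Subnet 1 and 2
    TargetGroupARNs=["arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/lab-target-group/abc123"],  # placeholder ARN
    HealthCheckType="ELB",
    HealthCheckGracePeriod=300,
    Tags=[{"Key": "Name", "Value": "Lab Instance", "PropagateAtLaunch": True}],
)

# Keep average CPU utilization around 50%: scale out above it, scale in below it.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="Lab Auto Scaling Group",
    PolicyName="lab-cpu-target-tracking",  # placeholder policy name
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
)
```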

🔹 Task 5: Verifying Load Balancing Works

Let’s confirm traffic is being distributed correctly.

  1. Go to EC2 → Instances → confirm that two new Lab Instance instances are running.

  2. In Target Groups, verify both instances show Healthy.

  3. Open a browser, paste the ALB DNS name, and see the Load Test app.

Success! The ALB is routing traffic to the instances. The Load Test application should appear in your browser, which means that the load balancer received the request, sent it to one of the EC2 instances, and passed back the result.
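
The same checks can be scripted with boto3 and the requests library; the target group ARN and ALB DNS name below are placeholders for your own values.

```python
# Small sketch of Task 5's checks: target health, then a request through the ALB.
import boto3
import requests

elbv2 = boto3.client("elbv2", region_name="us-east-1")

health = elbv2.describe_target_health(
    TargetGroupArn="arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/lab-target-group/abc123"  # placeholder
)
for target in health["TargetHealthDescriptions"]:
    print(target["Target"]["Id"], target["TargetHealth"]["State"])  # expect "healthy"

# Request the app the same way a browser would, using the ALB's DNS name.
resp = requests.get("http://LabELB-1234567890.us-east-1.elb.amazonaws.com")  # placeholder DNS name
print(resp.status_code)  # 200 means the ALB forwarded the request to an instance
```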

🔹 Task 6: Testing Auto Scaling

Now, we’ll simulate high traffic to trigger scaling.

  1. In the Load Test app, click Load Test (spikes CPU usage).

  2. Go to CloudWatch → Alarms → monitor the AlarmHigh alarm.

    • After ~5 minutes, it should change to In alarm (CPU > 50%).

  3. Check EC2 → Instances → new instances should launch (up to 4).

Why? Auto Scaling adds instances when demand increases!
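
If you'd rather watch the scale-out from a terminal, a rough polling sketch like this prints the group's desired capacity and instance count every 30 seconds while the load test runs; it assumes the group name used above.

```python
# Rough sketch for Task 6: poll the Auto Scaling group while CPU load is high.
import time
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

for _ in range(20):  # check every 30 seconds for about 10 minutes
    group = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=["Lab Auto Scaling Group"]
    )["AutoScalingGroups"][0]
    print("desired:", group["DesiredCapacity"],
          "| running instances:", len(group["Instances"]))
    time.sleep(30)
```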

Conclusion

By using ELB + Auto Scaling, we’ve built a self-healing, scalable infrastructure that:
  • Distributes traffic efficiently.

  • Automatically scales under load.

  • Maintains high availability across multiple AZs.
