AWS Architecture: Best Practices for Scaling and Load Balancing

Introduction
In today’s cloud-driven world, ensuring your applications are highly available, scalable, and fault-tolerant is crucial. AWS provides powerful services like Elastic Load Balancing (ELB) and EC2 Auto Scaling to help distribute traffic efficiently and automatically adjust capacity based on demand.
This guide continues our previous article, Build Your First Secure AWS VPC: Public/Private Subnets, NAT Gateways & Web Servers, and builds on top of that architecture:
https://hashnode.com/post/cmcw43tur000l02jvfwk98y14
[Figure: the starting architecture, before scaling and load balancing are added]
[Figure: the target architecture we are going to build]
In this hands-on lab, we’ll:
✔ Convert an existing EC2 instance into an AMI for consistent deployments.
✔ Set up an Application Load Balancer (ALB) to distribute traffic.
✔ Create an Auto Scaling Group (ASG) to automatically scale instances.
✔ Test scaling policies by simulating traffic spikes.
🔹 Task 1: Creating an AMI for Auto Scaling
Since we want new instances to match our existing web server, we’ll create an Amazon Machine Image (AMI).
1. Open the EC2 Dashboard → Instances.
2. Select Web Server 1 (running instance).
3. Click Actions → Image and templates → Create image.
4. Configure:
   Image name: Web Server AMI
   Description: Lab AMI for Web Server
5. Click Create image.
✅ Why? This ensures every new instance launched by Auto Scaling has the same setup.
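For readers who prefer to script this step, the same image can be requested through the AWS SDK for Python (boto3). The sketch below is a minimal example under assumptions: boto3 is installed and AWS credentials are configured, and the instance ID shown is a placeholder, not a value from this lab. The request parameters are built in a separate function so they can be inspected without calling AWS.

```python
def build_create_image_params(instance_id: str) -> dict:
    """Return keyword arguments for EC2 CreateImage, mirroring the console fields."""
    return {
        "InstanceId": instance_id,
        "Name": "Web Server AMI",
        "Description": "Lab AMI for Web Server",
    }


def create_web_server_ami(instance_id: str) -> str:
    """Call EC2 CreateImage and return the new AMI ID.

    Requires boto3 and configured AWS credentials (assumptions, not part of the lab).
    """
    import boto3  # imported lazily so the parameter builder works without boto3

    ec2 = boto3.client("ec2")
    response = ec2.create_image(**build_create_image_params(instance_id))
    return response["ImageId"]


if __name__ == "__main__":
    # Placeholder instance ID; replace it with the ID of Web Server 1.
    print(build_create_image_params("i-0123456789abcdef0"))
```

Calling `create_web_server_ami(...)` with the real instance ID would return the ID of the new AMI, which the launch template in Task 3 needs.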
🔹 Task 2: Creating a Load Balancer
We’ll use an Application Load Balancer (ALB) to distribute traffic across instances.
1. Navigate to EC2 → Load Balancers → Create Load Balancer.
2. Choose Application Load Balancer.
3. Configure:
   Name: LabELB
   VPC: Lab VPC
   For the first Availability Zone, choose Public Subnet 1.
   For the second Availability Zone, choose Public Subnet 2.
   Security Group: Web Security Group (allows HTTP traffic).
4. Under Listeners & Routing, click Create target group.
   Target type: Instances
   Name: lab-target-group
   Click Next → Create target group.
5. Back in the ALB setup, refresh and select lab-target-group.
6. Click Create load balancer.
📌 Note: Copy the DNS name of the ALB (we’ll use it later).
✅ Why? The ALB spreads incoming requests across the registered instances and sends traffic only to instances that pass its health checks.
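To make the routing and health-check behavior concrete, here is a toy simulation of round-robin distribution that skips unhealthy targets. It is not AWS code: round robin is the default routing algorithm for ALB target groups, but the function and the instance names below are hypothetical and exist only for illustration.

```python
from collections import Counter
from itertools import cycle


def route_requests(targets: dict[str, bool], request_count: int) -> Counter:
    """Distribute requests round-robin across the targets marked healthy.

    `targets` maps a hypothetical instance name to its health status.
    """
    healthy = [name for name, is_healthy in targets.items() if is_healthy]
    if not healthy:
        raise RuntimeError("no healthy targets available")
    rotation = cycle(healthy)
    return Counter(next(rotation) for _ in range(request_count))


if __name__ == "__main__":
    # One of three hypothetical instances is failing its health check.
    fleet = {"web-1": True, "web-2": True, "web-3": False}
    print(route_requests(fleet, 10))  # Counter({'web-1': 5, 'web-2': 5})
```

The unhealthy instance receives no requests, and the remaining two split the traffic evenly.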
🔹 Task 3: Creating a Launch Template
Auto Scaling needs a launch template to know how to configure new instances.
1. Go to EC2 → Launch Templates → Create launch template.
2. Configure:
   Name: lab-app-launch-template
   Template version description: A web server for the load test app
   Auto Scaling guidance: choose Provide guidance to help me set up a template that I can use with EC2 Auto Scaling.
   AMI: Web Server AMI (created earlier)
   Instance type: t3.micro
   Security Group: Web Security Group
3. Click Create launch template.
✅ Why? This defines the "blueprint" for new instances.
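The same blueprint can be expressed as a boto3 request. This is a hedged sketch, assuming boto3 and AWS credentials are available; the AMI ID and security group ID are placeholders that you would replace with the values from your own account.

```python
def build_launch_template_params(ami_id: str, security_group_id: str) -> dict:
    """Return keyword arguments for EC2 CreateLaunchTemplate, mirroring the console fields."""
    return {
        "LaunchTemplateName": "lab-app-launch-template",
        "VersionDescription": "A web server for the load test app",
        "LaunchTemplateData": {
            "ImageId": ami_id,
            "InstanceType": "t3.micro",
            "SecurityGroupIds": [security_group_id],
        },
    }


def create_launch_template(ami_id: str, security_group_id: str) -> str:
    """Create the launch template and return its ID (requires boto3 and credentials)."""
    import boto3  # imported lazily so the parameter builder works without boto3

    ec2 = boto3.client("ec2")
    response = ec2.create_launch_template(
        **build_launch_template_params(ami_id, security_group_id)
    )
    return response["LaunchTemplate"]["LaunchTemplateId"]


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    print(build_launch_template_params("ami-0123456789abcdef0", "sg-0123456789abcdef0"))
```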
🔹 Task 4: Creating an Auto Scaling Group
Now, we’ll set up Auto Scaling to manage instance count dynamically.
1. Choose lab-app-launch-template, and then from the Actions dropdown list, choose Create Auto Scaling group.
2. Configure:
   Name: Lab Auto Scaling Group
3. Choose Next.
4. Configure the network:
   VPC: Lab VPC
   Availability Zones and subnets: Private Subnet 1 (10.0.1.0/24) and Private Subnet 2 (10.0.3.0/24)
5. Choose Next.
6. In the Attach to an existing load balancer section, select Choose from your load balancer target groups, and then choose lab-target-group.
7. Set Health check type: ELB (instances that fail the load balancer's health checks are replaced, so only healthy instances receive traffic).
8. Choose Next.
9. Configure scaling:
   Desired capacity: 2
   Minimum capacity: 2
   Maximum capacity: 4
10. Enable Target tracking scaling policy:
   Metric: Average CPU utilization
   Target value: 50 (Auto Scaling adds or removes instances to keep average CPU utilization near 50%)
11. Add a tag with key Name and value Lab Instance.
12. Click Create Auto Scaling group.
✅ Why? Auto Scaling ensures we always have enough instances to handle traffic.
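As a rough scripted equivalent of the console steps in this task, the sketch below builds the two boto3 requests involved: one to create the group and one to attach a target tracking policy. Several things here are assumptions rather than lab values: the subnet IDs and target group ARN are placeholders, the policy name `lab-cpu-target-tracking` is made up, and the 300-second health check grace period is an illustrative choice.

```python
def build_asg_params(subnet_ids: list[str], target_group_arn: str) -> dict:
    """Return keyword arguments for Auto Scaling CreateAutoScalingGroup."""
    return {
        "AutoScalingGroupName": "Lab Auto Scaling Group",
        "LaunchTemplate": {
            "LaunchTemplateName": "lab-app-launch-template",
            "Version": "$Latest",
        },
        "MinSize": 2,
        "MaxSize": 4,
        "DesiredCapacity": 2,
        # The API expects the subnets as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        "TargetGroupARNs": [target_group_arn],
        "HealthCheckType": "ELB",
        "HealthCheckGracePeriod": 300,  # assumption: not specified in the lab
        "Tags": [
            {"Key": "Name", "Value": "Lab Instance", "PropagateAtLaunch": True},
        ],
    }


def build_scaling_policy_params() -> dict:
    """Return keyword arguments for PutScalingPolicy (target tracking on average CPU)."""
    return {
        "AutoScalingGroupName": "Lab Auto Scaling Group",
        "PolicyName": "lab-cpu-target-tracking",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": 50.0,
        },
    }


def create_auto_scaling_group(subnet_ids: list[str], target_group_arn: str) -> None:
    """Create the group and attach the policy (requires boto3 and credentials)."""
    import boto3  # imported lazily so the builders work without boto3

    autoscaling = boto3.client("autoscaling")
    autoscaling.create_auto_scaling_group(**build_asg_params(subnet_ids, target_group_arn))
    autoscaling.put_scaling_policy(**build_scaling_policy_params())
```

Keeping the parameters in plain dictionaries makes it easy to review the minimum, maximum, and target value before anything is created.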
🔹 Task 5: Verifying Load Balancing Works
Let’s confirm traffic is being distributed correctly.
1. Go to EC2 → Instances and check that two new instances named Lab Instance are running.
2. In Target Groups, verify both instances show Healthy.
3. Open a browser, paste the ALB DNS name, and confirm that the Load Test app loads.
✅ Success! The ALB is routing traffic to the instances. Seeing the Load Test application in your browser means that the load balancer received the request, sent it to one of the EC2 instances, and passed back the result.
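The target health check can also be done from a script. The helper below is a small sketch that inspects a response shaped like the one returned by the Elastic Load Balancing `describe_target_health` call and lists any targets that are not yet healthy. The sample response and instance IDs are invented for illustration, and the live call assumes boto3 and AWS credentials.

```python
def unhealthy_targets(response: dict) -> list[str]:
    """Return the IDs of targets whose state is anything other than 'healthy'."""
    return [
        description["Target"]["Id"]
        for description in response.get("TargetHealthDescriptions", [])
        if description["TargetHealth"]["State"] != "healthy"
    ]


def check_target_group(target_group_arn: str) -> list[str]:
    """Query the target group and return unhealthy target IDs (requires boto3)."""
    import boto3  # imported lazily so the helper above works without boto3

    elbv2 = boto3.client("elbv2")
    response = elbv2.describe_target_health(TargetGroupArn=target_group_arn)
    return unhealthy_targets(response)


if __name__ == "__main__":
    # Invented sample: one instance is healthy, one is still registering.
    sample = {
        "TargetHealthDescriptions": [
            {"Target": {"Id": "i-0aaa", "Port": 80}, "TargetHealth": {"State": "healthy"}},
            {"Target": {"Id": "i-0bbb", "Port": 80}, "TargetHealth": {"State": "initial"}},
        ]
    }
    print(unhealthy_targets(sample))
```

An empty list means every registered instance is passing its health checks.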
🔹 Task 6: Testing Auto Scaling
Now, we’ll simulate high traffic to trigger scaling.
1. In the Load Test app, click Load Test (spikes CPU usage).
2. Go to CloudWatch → Alarms and monitor the AlarmHigh alarm. After ~5 minutes, it should change to In alarm (CPU > 50%).
3. Check EC2 → Instances. New instances should launch (up to the maximum of 4).
✅ Why? Auto Scaling adds instances when demand increases!
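For intuition about how many instances to expect, the following sketch models target tracking as a simple proportional rule: scale the current instance count by the ratio of observed CPU to the target, then clamp the result to the group's limits. This is a simplified model for illustration only, not the service's actual algorithm, and the function name is hypothetical. The defaults match the values used in this lab (target 50, minimum 2, maximum 4).

```python
import math


def estimate_desired_capacity(
    current_instances: int,
    average_cpu: float,
    target_cpu: float = 50.0,
    minimum: int = 2,
    maximum: int = 4,
) -> int:
    """Estimate the instance count needed to bring average CPU back to the target.

    Simplified proportional model; the real service may behave differently.
    """
    if current_instances <= 0 or target_cpu <= 0:
        raise ValueError("current_instances and target_cpu must be positive")
    proportional = math.ceil(current_instances * average_cpu / target_cpu)
    # Never go below the group's minimum or above its maximum.
    return max(minimum, min(maximum, proportional))


if __name__ == "__main__":
    for cpu in (10, 50, 60, 90):
        print(f"average CPU {cpu}% with 2 instances -> {estimate_desired_capacity(2, cpu)}")
```

Under this model, two instances running at 90% CPU would grow to the maximum of four, while a quiet group never shrinks below the minimum of two.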
Conclusion
By using ELB + Auto Scaling, we’ve built a self-healing, scalable infrastructure that:
✔ Distributes traffic efficiently.
✔ Automatically scales under load.
✔ Maintains high availability across multiple AZs.
Written by Salome Githinji
