Cloud Guardian: Building a Robust AWS Monitoring Solution with EC2, CloudWatch, and SNS

Monitoring cloud infrastructure is critical for performance, cost optimization, and security. Cloud Guardian is an open-source project that showcases how to use AWS EC2, CloudWatch, and SNS to monitor CPU utilization and send real-time alerts. This blog walks you through the project’s implementation, complete with interactive code snippets, a CPU utilization chart, and ways to engage with the project. Whether you’re a DevOps engineer, cloud enthusiast, or developer, Cloud Guardian offers a practical blueprint for AWS monitoring.
Explore the project on GitHub: Cloud Guardian
Project Overview
Cloud Guardian is an open-source project designed to monitor AWS EC2 instance performance, specifically CPU utilization, and send instant notifications via AWS SNS when thresholds are breached. By using Terraform for infrastructure provisioning and a Python script for testing, the project showcases a practical approach to cloud resource management. The core components include:
AWS EC2: A t2.micro instance for running workloads.
AWS CloudWatch: Real-time monitoring of CPU utilization with customizable alarms.
AWS SNS: Email notifications triggered when CPU usage exceeds 50%.
Terraform: Infrastructure-as-code for automated EC2 setup.
Python Testing Script: A cpu_spike.py script to simulate CPU load and validate alerts.
The project is hosted on GitHub at Cloud Guardian. Let’s explore how Cloud Guardian was built and how you can replicate it.
Caption: The Cloud Guardian dashboard displaying real-time CPU utilization metrics.
Why Cloud Guardian?
Monitoring cloud resources is critical for maintaining performance, optimizing costs, and ensuring security. Cloud Guardian addresses these needs by:
Automating EC2 instance provisioning with Terraform.
Providing real-time insights into CPU utilization via CloudWatch.
Delivering instant alerts through SNS for proactive incident response.
Enabling stress testing with a Python script to verify monitoring accuracy.
This project is ideal for developers learning AWS, DevOps practitioners implementing monitoring solutions, or businesses aiming to enhance cloud infrastructure reliability.
Implementation Journey
1. Setting Up the EC2 Instance
The foundation of Cloud Guardian is an AWS EC2 t2.micro instance, provisioned using Terraform. The setup includes:
Instance Creation: Launched a t2.micro instance, a cost-effective choice for testing.
Security Groups: Configured to allow SSH access for secure management.
Detailed Monitoring: Enabled CloudWatch detailed monitoring for granular metrics.
The Terraform configuration files (main.tf, variables.tf, provider.tf, and terraform.tfvars) define the infrastructure, ensuring reproducibility and scalability.
resource "aws_instance" "instance_log" {
  ami           = var.ami_id
  instance_type = var.instance_type
  key_name      = var.key_name

  # Detailed (1-minute) CloudWatch monitoring, as described in the setup list
  monitoring = true

  # Security group for SSH access
  vpc_security_group_ids = [aws_security_group.ssh_sg.id]

  tags = {
    Name    = "cloud-guardian-instance"
    Project = "Cloud Guardian"
  }
}
Caption: Screenshot of the AWS EC2 console showing the configured t2.micro instance.
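Once the instance is up, it can be worth confirming that detailed monitoring is actually switched on before relying on one-minute metrics. The following is a minimal sketch, not code from the repository: the function name `monitoring_states` is hypothetical, it assumes the `cloud-guardian-instance` Name tag from the Terraform block, and it expects a boto3 EC2 client (for example `boto3.client("ec2")`) to be passed in. Pagination is ignored for brevity.

```python
from __future__ import annotations

from typing import Any


def monitoring_states(ec2_client: Any, name_tag: str = "cloud-guardian-instance") -> dict[str, str]:
    """Map instance ID -> CloudWatch monitoring state for running instances with this Name tag.

    `ec2_client` is assumed to be a boto3 EC2 client.
    """
    response = ec2_client.describe_instances(
        Filters=[
            {"Name": "tag:Name", "Values": [name_tag]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )
    states: dict[str, str] = {}
    for reservation in response["Reservations"]:
        for instance in reservation["Instances"]:
            # "enabled" means detailed (1-minute) monitoring; "disabled" means basic (5-minute).
            states[instance["InstanceId"]] = instance["Monitoring"]["State"]
    return states
```

If an instance reports "disabled", CloudWatch only receives CPU datapoints at five-minute granularity, which delays alarm transitions.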
2. Integrating CloudWatch for Monitoring
AWS CloudWatch is the backbone of Cloud Guardian’s monitoring capabilities. Key steps included:
Metrics Collection: Configured CloudWatch to track CPU utilization in real time.
Alarm Setup: Created an alarm to trigger when CPU usage exceeds 50%.
SNS Integration: Linked the alarm to an SNS topic for email notifications.
This setup ensures that any performance anomalies are detected and communicated instantly.
Caption: CloudWatch dashboard displaying CPU utilization metrics and alarm status.
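The post does not show how the alarm itself is defined, so here is a hedged sketch of what the three steps above could look like with boto3. The alarm name, the `Average` statistic, the 60-second period, and the single evaluation period are all assumptions chosen for illustration; only the `CPUUtilization` metric, the 50% threshold, and the SNS action come from the text. The client is passed in (for example `boto3.client("cloudwatch")`).

```python
from __future__ import annotations

from typing import Any


def cpu_alarm_params(instance_id: str, topic_arn: str, threshold: float = 50.0) -> dict[str, Any]:
    """Build keyword arguments for CloudWatch put_metric_alarm (illustrative values)."""
    return {
        "AlarmName": f"cloud-guardian-cpu-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",        # assumption: alarm on the per-period average
        "Period": 60,                  # seconds; 1-minute data needs detailed monitoring
        "EvaluationPeriods": 1,        # assumption: fire after one breaching period
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],   # SNS topic notified when the alarm enters ALARM
    }


def create_cpu_alarm(cloudwatch_client: Any, instance_id: str, topic_arn: str) -> None:
    """Create or update the alarm; `cloudwatch_client` is assumed to be a boto3 client."""
    cloudwatch_client.put_metric_alarm(**cpu_alarm_params(instance_id, topic_arn))
```

Keeping the parameters in a plain dictionary makes the threshold and period easy to review or unit-test without touching AWS.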
3. Configuring SNS Notifications
AWS SNS (Simple Notification Service) enables real-time email alerts. The process involved:
Creating an SNS topic in the AWS Console.
Subscribing an email address to the topic and confirming the subscription.
Linking CloudWatch alarms to the SNS topic to trigger notifications when CPU usage exceeds the 50% threshold.
This ensures stakeholders are promptly informed of potential issues, enabling quick resolution.
Caption: AWS SNS console showing the configured topic and email subscription.
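The article performs these steps in the AWS Console. For readers who prefer scripting them, the sketch below shows one possible boto3 equivalent; the function name `subscribe_email` is hypothetical, and the topic name and address are supplied by the caller. It expects a boto3 SNS client such as `boto3.client("sns")`.

```python
from typing import Any


def subscribe_email(sns_client: Any, topic_name: str, email: str) -> str:
    """Create (or reuse) an SNS topic, subscribe an email address, and return the topic ARN."""
    # create_topic returns the existing topic's ARN if a topic with this name already exists.
    topic_arn = sns_client.create_topic(Name=topic_name)["TopicArn"]
    # The subscription stays pending until the recipient clicks the confirmation link.
    sns_client.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint=email)
    return topic_arn
```

The returned ARN is the value a CloudWatch alarm would list as its alarm action. No email is delivered until the subscription has been confirmed.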
4. Testing and Validation
To verify the monitoring system, Cloud Guardian includes a Python script (cpu_spike.py) to simulate CPU load. The testing process was:
Initial Check: Confirmed CloudWatch was collecting metrics under normal conditions.
CPU Spike Test: Ran cpu_spike.py to generate a CPU load (e.g., 80% for 30 seconds).
Validation: Verified that CloudWatch detected the spike, triggered the alarm, and sent an SNS notification.
The script supports customizable parameters:
python cpu_spike.py --duration 60 --cpu-percent 80
This command simulates an 80% CPU load for 60 seconds, ideal for testing high-load scenarios.
Caption: Graph showing CPU utilization spike during testing, with the alarm triggered.
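The source of cpu_spike.py is not reproduced in this post, so the following is only a sketch of how a script with the same two flags might work, not the repository's implementation. It uses a simple duty cycle: within each short time slice it busy-waits for the requested fraction and sleeps for the rest. All function names are hypothetical, and it targets a single core, which matches the one vCPU of a t2.micro.

```python
import argparse
import time


def parse_args(argv=None):
    """Parse the two flags used in the article's example command."""
    parser = argparse.ArgumentParser(description="Simulate CPU load on one core")
    parser.add_argument("--duration", type=float, default=30.0, help="seconds to run")
    parser.add_argument("--cpu-percent", type=float, default=80.0, help="target load, 0-100")
    return parser.parse_args(argv)


def burn_one_core(duration, cpu_percent, slice_seconds=0.1):
    """Hold roughly `cpu_percent` load on one core for `duration` seconds."""
    busy = slice_seconds * min(max(cpu_percent, 0.0), 100.0) / 100.0
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        slice_start = time.monotonic()
        while time.monotonic() - slice_start < busy:
            pass  # busy-wait: this is the part that consumes CPU
        remaining = slice_seconds - (time.monotonic() - slice_start)
        if remaining > 0:
            time.sleep(remaining)  # idle for the rest of the slice


def expected_period_average(cpu_percent, duration, period=60.0, baseline=0.0):
    """Best-case average CPU over one metric period containing the whole spike."""
    busy = min(duration, period)
    return (cpu_percent * busy + baseline * (period - busy)) / period
```

A script entry point would call `args = parse_args()` followed by `burn_one_core(args.duration, args.cpu_percent)`. The helper `expected_period_average` illustrates a practical point: if the alarm evaluates the average over a 60-second period (an assumption, since the alarm definition is not shown), an 80% spike lasting 30 seconds averages out to about 40% and may not cross a 50% threshold, whereas a 60-second spike comfortably does.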
Project Structure
The Cloud Guardian repository is organized for clarity and ease of use:
Cloud-Guardian/
├── EC2/
│ ├── main.tf # Terraform configuration for EC2
│ ├── variables.tf # Variable definitions
│ ├── terraform.tfvars # Variable values
│ ├── provider.tf # AWS provider settings
│ └── README.md # EC2-specific documentation
├── default_metrics_demo/
│ └── cpu_spike.py # Python script for CPU load testing
└── README.md # Project overview
This structure separates infrastructure code from testing scripts, making it easy to manage and extend.
Key Features
Automated Provisioning: Terraform scripts streamline EC2 setup.
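Real-Time Monitoring: CloudWatch tracks CPU utilization alongside the other default EC2 metrics such as network and disk I/O. Memory and disk-space usage are not among the default metrics and would require the CloudWatch agent.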
Instant Alerts: SNS delivers email notifications for CPU spikes.
Secure Access: SSH-enabled with security group configurations.
Testing Framework: cpu_spike.py validates monitoring and alerting reliability.
How to Get Started
Prerequisites
Tools: Terraform (v1.0.0+), AWS CLI, Git.
AWS Requirements: Active AWS account, IAM permissions, access keys, verified email for SNS.
Setup Guide
Clone the Repository:
git clone https://github.com/amitkumar-Github8/Cloud-Guardian.git
cd Cloud-Guardian
Configure AWS CLI:
aws configure
Enter your AWS Access Key ID, Secret Access Key, region, and output format (e.g., JSON).
Set Up SNS:
Navigate to the AWS SNS Console.
Create a new topic and subscribe your email.
Confirm the subscription via the email received.
Deploy Infrastructure:
cd EC2
terraform init
terraform plan
terraform apply
Run CPU Spike Test:
cd ../default_metrics_demo
python cpu_spike.py --duration 30 --cpu-percent 80
Caption: Terminal output of terraform apply showing successful EC2 deployment.
Monitoring Results
The project successfully demonstrated:
CPU Tracking: CloudWatch captured real-time CPU utilization.
Alert Triggering: Alarms activated when CPU usage exceeded the 50% threshold.
Notification Delivery: SNS emails were sent promptly.
Real-Time Validation: Metrics and alerts functioned as expected.
Caption: CloudWatch alarm state transitioning to "ALARM" during a CPU spike.
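The validation described above was done by watching the console. As a hedged alternative, the sketch below polls the alarm state from a script while the CPU spike runs. The function name `wait_for_alarm` and its defaults are hypothetical, the alarm name must match whatever alarm was actually created, and the client is assumed to be `boto3.client("cloudwatch")`.

```python
import time
from typing import Any, Callable


def wait_for_alarm(
    cloudwatch_client: Any,
    alarm_name: str,
    timeout: float = 300.0,
    poll_interval: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the named alarm reports ALARM, or False if `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        response = cloudwatch_client.describe_alarms(AlarmNames=[alarm_name])
        alarms = response["MetricAlarms"]
        # StateValue is one of "OK", "ALARM", or "INSUFFICIENT_DATA".
        if alarms and alarms[0]["StateValue"] == "ALARM":
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_interval)
```

The `sleep` parameter exists only so the loop can be exercised in tests without waiting; in normal use the default is fine.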
Technologies Used
AWS EC2: aws.amazon.com/ec2
AWS CloudWatch: aws.amazon.com/cloudwatch
AWS SNS: aws.amazon.com/sns
Terraform: terraform.io
Python: For the cpu_spike.py testing script.
Why This Matters
Cloud Guardian is more than a proof-of-concept; it’s a practical blueprint for monitoring cloud infrastructure. By automating provisioning, monitoring, and alerting, it reduces manual overhead and ensures rapid response to performance issues. This project is particularly valuable for:
Learning AWS: Understand EC2, CloudWatch, and SNS in a real-world context.
DevOps Practices: Implement infrastructure-as-code and automated monitoring.
Cost Optimization: Identify and address resource overuse early.
Future Enhancements
To extend Cloud Guardian, consider:
Adding more metrics (e.g., memory, disk space) to CloudWatch.
Supporting additional notification channels (e.g., SMS, Slack).
Integrating with AWS Security Hub for enhanced security monitoring.
Automating multi-instance monitoring for larger deployments.
Conclusion
Cloud Guardian showcases the power of AWS services in building a robust monitoring solution. By combining EC2, CloudWatch, SNS, and Terraform, it provides a scalable, automated approach to cloud resource management. Whether you’re a beginner exploring AWS or a seasoned engineer optimizing infrastructure, this project offers valuable insights and a replicable framework.
Explore the project on GitHub: amitkumar-Github8/Cloud-Guardian. Star the repository if you find it useful, and feel free to contribute or open issues for support!
Caption: Cloud Guardian architecture diagram illustrating the flow from EC2 to CloudWatch to SNS notifications.
About the Author: Amit Kumar is a cloud enthusiast and developer passionate about AWS and infrastructure automation. Follow my work on GitHub or connect on Hashnode for more cloud projects and tutorials.
