Navigating AWS Cost Optimization: An In-Depth Guide (Day-12)

Rohit Deore

In today's fast-paced digital landscape, using cloud computing effectively while controlling costs is a balancing act many organizations strive to achieve. AWS cost optimization plays a pivotal role here, offering a way to reduce cloud expenses without compromising performance or scalability. This guide covers what AWS cost optimization is, why it matters, how to approach it, and its benefits and potential drawbacks, then brings the concepts to life with an illustrative example and a hands-on demo.

What is AWS Cost Optimization?

AWS cost optimization is the process of adjusting your usage and resources on Amazon Web Services (AWS) to lower costs while maintaining efficiency and meeting your requirements. It involves analyzing your current resource utilization, identifying areas of waste or inefficiency, and implementing strategies to reduce expenses, such as selecting more cost-effective resource types, utilizing reserved instances, or automating scaling to match demand.

Why is it Important?

In the cloud, where resources are billed based on consumption, costs can quickly spiral if not monitored and managed carefully. AWS cost optimization ensures that you're only paying for what you need, helping to maximize your return on investment (ROI) in the cloud. It allows organizations to free up budget for other strategic initiatives, improve operational efficiencies, and maintain a competitive edge in their respective markets.

How to Optimize AWS Costs

1. Right-Sizing Resources: Analyze your resource utilization to ensure you're using the appropriate types and sizes for your workloads. Downsizing or upgrading to more efficient types can reduce costs significantly.

2. Using Reserved Instances and Savings Plans: Committing to reserved instances or savings plans for services like Amazon EC2 and RDS can offer substantial discounts over on-demand pricing.

3. Implementing Auto-Scaling: Use AWS Auto Scaling to automatically adjust the number of resources in use based on demand, ensuring you're not paying for idle resources.

4. Monitoring and Reporting with Amazon CloudWatch and AWS Cost Explorer: Regularly monitor usage and costs with tools like Amazon CloudWatch and AWS Cost Explorer. These tools can help identify trends, set alarms for unexpected cost spikes, and uncover optimization opportunities.

5. Deleting Unused Resources: Regularly review and terminate or stop resources that are no longer in use, such as old instances, unattached volumes, or obsolete snapshots.
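As a rough illustration of the monitoring step, the sketch below totals recent spend per service with the Cost Explorer API. It assumes you pass in a boto3 Cost Explorer client (`boto3.client("ce")`) whose credentials allow `ce:GetCostAndUsage`. The function name `cost_by_service` and its parameters are made up for this example, and the sketch ignores result pagination.

```python
from datetime import date, timedelta


def cost_by_service(ce_client, days=30, today=None):
    """Return {service name: unblended cost} for the last `days` days, highest first.

    `ce_client` is assumed to be a boto3 Cost Explorer client.
    `today` exists only so the date range can be fixed in tests.
    """
    end = today or date.today()
    start = end - timedelta(days=days)

    response = ce_client.get_cost_and_usage(
        TimePeriod={"Start": start.isoformat(), "End": end.isoformat()},
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )

    # A date range can span several monthly buckets, so sum them per service.
    totals = {}
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            service = group["Keys"][0]
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[service] = totals.get(service, 0.0) + amount

    # Sort so the most expensive services come first.
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))
```

Calling `cost_by_service(boto3.client("ce"))` once a week and comparing the top entries is one simple way to spot a service whose cost is drifting upward.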

Advantages

  • Cost Reduction: The most direct benefit is a reduction in AWS bills, allowing funds to be allocated to other areas.

  • Improved Efficiency: By optimizing resource usage, you're ensuring that you're getting the most out of what you're paying for.

  • Enhanced Scalability: Cost optimization strategies like auto-scaling make it easier to handle workload fluctuations without manual intervention.

Disadvantages

  • Complexity: Navigating the multitude of pricing models and services in AWS can be overwhelming.

  • Time Investment: Analyzing and implementing optimization strategies requires an upfront time investment.

  • Risk of Under-Provisioning: Overzealous cost-cutting could lead to under-provisioning, potentially impacting performance or availability.

Example

A web application running on Amazon EC2 instances experiences variable traffic. Implementing auto-scaling groups for these instances ensures that during peak times, additional instances are automatically launched to handle the load, and during quiet periods, excess instances are terminated. This dynamic adjustment keeps the application responsive while minimizing costs.
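One way to set up the behaviour in this example is a target tracking scaling policy on the Auto Scaling group. The sketch below is a minimal illustration and not a complete setup: it assumes the group already exists and that you pass in a boto3 Auto Scaling client (`boto3.client("autoscaling")`). The function name, the policy naming scheme, and the 50% CPU target are assumptions chosen for this example.

```python
def attach_cpu_target_tracking(autoscaling_client, group_name, target_cpu=50.0):
    """Attach a target tracking policy that keeps average CPU near `target_cpu` percent.

    `autoscaling_client` is assumed to be a boto3 Auto Scaling client, and
    `group_name` the name of an existing Auto Scaling group.
    """
    return autoscaling_client.put_scaling_policy(
        AutoScalingGroupName=group_name,
        PolicyName=f"{group_name}-cpu-target-tracking",  # hypothetical naming scheme
        PolicyType="TargetTrackingScaling",
        TargetTrackingConfiguration={
            "PredefinedMetricSpecification": {
                # Average CPU utilization across the instances in the group
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": float(target_cpu),
        },
    )
```

With such a policy in place, the group adds instances when average CPU rises above the target and removes them when it falls, within the minimum and maximum sizes configured on the group.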

Demo

We'll create a Lambda function that identifies EBS snapshots no longer associated with any active EC2 instance and deletes them to save on storage costs.

Lambda Function Working: The Lambda function retrieves all EBS snapshots owned by the current account ('self') and gathers a list of active EC2 instances. For each snapshot, it checks whether the source volume still exists and is still attached to an instance. If the volume is gone or unattached, the snapshot is considered stale and is deleted, which reduces storage costs.

Let's start the demo.

  1. Create an EC2 instance. We will create snapshots from the volume attached to this instance.

  2. You can check if there is a volume created for the EC2 instance by going to the EC2 dashboard and clicking on 'Volumes'. There, you will be able to see the volume created for the instance we just created.

  3. Go to the EC2 dashboard, click on Snapshots, and create the snapshots.
    - What are snapshots? In Amazon Web Services (AWS), snapshots are point-in-time copies of Amazon Elastic Block Store (EBS) volumes. A snapshot captures the exact state and data of an EBS volume at the moment the snapshot is initiated. Snapshots are stored in Amazon Simple Storage Service (S3) and are used for backups, for replicating data across regions, and for creating new EBS volumes.

    Now select the volume. Scroll down and click on "Create Snapshot".

    The snapshot is created.

    Suppose a developer wants to remove the EC2 instance along with its volumes and snapshots, but terminates only the instance and forgets about the snapshot. The volume is deleted together with the instance, but the snapshot remains and keeps incurring storage charges. A Lambda function can clean up such cases.

  4. Navigate to the Lambda functions and create a function for cost optimization. Click on 'Create function'.

    Provide the function name and select the Python runtime. Click on "Create Function".

  5. Copy the code provided below and paste it into the 'Code Source' section in lambda_function.py.

     import boto3
    
     def lambda_handler(event, context):
         ec2 = boto3.client('ec2')
    
         # Retrieve all EBS snapshots owned by this account
         response = ec2.describe_snapshots(OwnerIds=['self'])
    
         # Collect the IDs of all active (running or stopped) EC2 instances
         instances_response = ec2.describe_instances(
             Filters=[{'Name': 'instance-state-name', 'Values': ['running', 'stopped']}]
         )
         active_instance_ids = set()
    
         for reservation in instances_response['Reservations']:
             for instance in reservation['Instances']:
                 active_instance_ids.add(instance['InstanceId'])
    
         # Delete each snapshot that has no source volume, whose volume no longer
         # exists, or whose volume is not attached to an active instance
         for snapshot in response['Snapshots']:
             snapshot_id = snapshot['SnapshotId']
             volume_id = snapshot.get('VolumeId')
    
             if not volume_id:
                 # The snapshot records no source volume, so delete it
                 ec2.delete_snapshot(SnapshotId=snapshot_id)
                 print(f"Deleted EBS snapshot {snapshot_id} because it was not attached to any volume.")
             else:
                 try:
                     # Look up the source volume; this raises an error if the volume was deleted
                     volume_response = ec2.describe_volumes(VolumeIds=[volume_id])
                     attachments = volume_response['Volumes'][0]['Attachments']
                     if not any(a['InstanceId'] in active_instance_ids for a in attachments):
                         ec2.delete_snapshot(SnapshotId=snapshot_id)
                         print(f"Deleted EBS snapshot {snapshot_id} because it originated from a volume not attached to any active instance.")
                 except ec2.exceptions.ClientError as e:
                     if e.response['Error']['Code'] == 'InvalidVolume.NotFound':
                         # The source volume no longer exists (it was probably deleted)
                         ec2.delete_snapshot(SnapshotId=snapshot_id)
                         print(f"Deleted EBS snapshot {snapshot_id} because its associated volume was not found.")
                     else:
                         # Log unrelated errors instead of silently ignoring them
                         print(f"Skipped EBS snapshot {snapshot_id}: {e}")
    

    Click on "Deploy" and then click on "Test" to trigger the function.

    When you hit the 'Test' button, a window to configure the test event will pop up. Just give the event a name and save it.

    Click on 'Test' again to trigger the function.

    An error will occur: "Task timed out after 3.03 seconds". The default Lambda timeout is 3 seconds, which is too short for this function.

    Go to the Configuration tab and, under General configuration, click on Edit.

    Set the Timeout to 10 seconds and click Save.

  6. Hit the "Test" button again to trigger the function, and you will see an error.
    The error message indicates that the IAM role cost-optimization-function-role-10vh6cs1 used by the Lambda function does not have permission to perform the ec2:DescribeSnapshots action. To resolve this, attach a policy granting the ec2:DescribeSnapshots permission to the role.

    Now, go to IAM and click on "Policies".

    Click on 'Create Policy'

    Select the EC2 service, search for snapshots, and choose 'DescribeSnapshots'.

    Also, search for 'DeleteSnapshot' and select it, as we need this for the deletion of a snapshot. Then, in resources, select 'All' if not selected by default.

    Click the "Next" button.
    Then enter a name for the policy.

    Click the "Create policy" button.

    You can see that the policy has been created.

    Now, return to the Lambda function and navigate to the "Configuration" tab. Select "Permissions," and then, under "Execution role," locate "Role name." Below it, you will find a link. Simply click on that link.

    Here, go to 'Add permissions'.

    Select "Attach policies"

    Then search for the policy you just created, select it, and then click the 'Add permissions' button.

    Now, go ahead and click the "Test" button in the Lambda function. You will see an error stating, "When calling the DescribeInstances operation: You are not authorized to perform this operation." To resolve this, you need to grant the necessary permissions by creating a new policy.

    Navigate to Policies and create one that allows DescribeInstances and DescribeVolumes.

    Now, go back to the Lambda function and open the Configuration tab. Select Permissions, then click the link below 'Role name' and attach the new policy in the same way as before.

    The policy is successfully attached to the role.
    Now, go back to the Lambda function and trigger it using the "Test" button.

    You can see that it executed successfully and found nothing to delete.

    Please go and check if that snapshot has been deleted. You will find that it has not been deleted because the EC2 instance is still running.

  7. Now, go ahead and terminate the EC2 instance. After doing so, check for the snapshot; it will still be present. Terminating an EC2 instance only deletes that instance and the volume attached to it.

    Snapshot is not deleted.

    Now, go and trigger the Lambda function by clicking the 'Test' button.

    The function logs show: "Deleted EBS snapshot snap-0aa45e3267e46ee83 because its associated volume was not found."

    Go and check for snapshots; you will see that there are no snapshots.

    The demo is now complete.
    In practice, an account can accumulate hundreds of snapshots, and the function deletes every stale one in a single run.
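With that many snapshots, two precautions are worth considering: paginating the API calls so that no results are missed, and doing a dry run before deleting anything. The sketch below is one possible way to do both and is not part of the demo function. The name `find_stale_snapshots` is made up for this example, `ec2` is assumed to be a boto3 EC2 client, and a snapshot counts as stale when its source volume no longer exists or has no attachments.

```python
def find_stale_snapshots(ec2):
    """Return the IDs of stale snapshots without deleting anything.

    `ec2` is assumed to be a boto3 EC2 client. Paginators are used so that
    accounts with many volumes or snapshots are covered completely.
    """
    # Map every existing volume ID to its list of attachments.
    attachments_by_volume = {}
    for page in ec2.get_paginator("describe_volumes").paginate():
        for volume in page["Volumes"]:
            attachments_by_volume[volume["VolumeId"]] = volume["Attachments"]

    stale = []
    for page in ec2.get_paginator("describe_snapshots").paginate(OwnerIds=["self"]):
        for snapshot in page["Snapshots"]:
            volume_id = snapshot.get("VolumeId")
            # .get() gives None for a missing volume and [] for an unattached
            # one; both mean the snapshot is a cleanup candidate.
            if not attachments_by_volume.get(volume_id):
                stale.append(snapshot["SnapshotId"])
    return stale
```

Printing the returned IDs and reviewing them first gives you a chance to spot snapshots you still need before any deletion happens.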

This is how cost optimization occurs for various resources in AWS.
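For reference, the permissions granted through the console in step 6 can also be written as a single IAM policy document. The sketch below builds that document in Python and prints it as JSON. It merges the two policies from the demo into one, which is a choice made for this example, and the constant name is made up. `"Resource": "*"` mirrors the 'All' resources selection in the demo, and you may want to narrow the delete permission in a real account.

```python
import json

# The four EC2 actions the demo Lambda function calls.
SNAPSHOT_CLEANUP_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeSnapshots",
                "ec2:DescribeInstances",
                "ec2:DescribeVolumes",
                "ec2:DeleteSnapshot",
            ],
            "Resource": "*",
        }
    ],
}

# Serialize to JSON, ready to paste into the JSON tab of the IAM policy editor.
policy_json = json.dumps(SNAPSHOT_CLEANUP_POLICY, indent=2)
print(policy_json)
```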

In Closing

Embracing AWS cost optimization is not just about cutting costs; it's about investing wisely in the cloud to ensure sustainable growth and competitiveness. By understanding and implementing the strategies outlined in this guide, organizations can enjoy the full spectrum of benefits offered by AWS without the burden of unnecessary costs. Remember, cost optimization is an ongoing process, requiring continuous review and adjustment as your needs evolve. Start optimizing today, and transform your AWS expenditure into a strategic asset for tomorrow.

