Save AWS Costs with These Drop-In Alternatives

Ted O'Connor

Built from experience, not autocomplete.

Let's face it — AWS is a masterpiece of technical engineering... and financial misdirection. If you've ever woken up to a surprise cloud bill that looked more like a car payment, you're not alone. The good news? You can get 80% of the value for 20% of the cost — if you know where to look. This is true whether you're a solo hobbyist just playing around or managing your company's six-figure monthly spend. Like all technical solutions, these are not one-size-fits-all; your requirements might dictate the more expensive option.

In this post, I'll walk through practical, battle-tested drop-in replacements that can dramatically cut your AWS bill — no re-architecting or vendor migrations required. Just plain old smart DevOps.


☁️ 1. Replace NAT Gateway with an EC2 NAT Instance

The problem: NAT Gateway seems cheap until it isn't. You're billed per hour and per GB processed, and it adds up fast. The hourly rate varies by region but is typically around $0.05, which comes out to roughly $36/month even if the gateway is never used. If you're just hacking together some stuff on a personal account, this can be one of your largest costs. At production levels the hourly cost isn't the issue; the bandwidth cost is, because you still pay AWS egress charges but now add NAT Gateway data-processing charges for both ingress and egress on top of that.

The fix: Roll your own NAT instance using a t4g.nano (ARM-based, dirt cheap) or t3.nano instance with IP forwarding enabled. Even two larger instances can cost less than the gateway when you have higher bandwidth needs. AWS used to publish AMIs for running NAT instances, since they predate NAT Gateways, but those have reached end-of-life. Creating your own NAT instance isn't too hard. Here are the official AWS docs.
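To make this concrete, here's a minimal Terraform sketch of the idea. All of the resource names are placeholders, the AMI filter is illustrative, and the network interface name in the user data varies by instance family, so treat this as a starting point rather than a finished module.

data "aws_ami" "al2023_arm" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-*-arm64"]
  }
}

resource "aws_instance" "nat" {
  ami                    = data.aws_ami.al2023_arm.id
  instance_type          = "t4g.nano"
  subnet_id              = aws_subnet.public.id # must live in a public subnet
  source_dest_check      = false                # required so it can forward traffic it didn't originate
  vpc_security_group_ids = [aws_security_group.nat.id]

  # Runs on first boot: enable IP forwarding and masquerade outbound traffic.
  # AL2023 may not ship iptables by default, hence the install step.
  user_data = <<-EOF
    #!/bin/bash
    dnf install -y iptables
    sysctl -w net.ipv4.ip_forward=1
    iptables -t nat -A POSTROUTING -o ens5 -j MASQUERADE
  EOF
}

# Point the private subnet's default route at the instance's primary ENI.
resource "aws_route" "private_default" {
  route_table_id         = aws_route_table.private.id
  destination_cidr_block = "0.0.0.0/0"
  network_interface_id   = aws_instance.nat.primary_network_interface_id
}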

Why this works:
For modest traffic, a NAT instance performs just fine. You lose the "managed" feel but regain control and visibility — and can save hundreds per month when uptime isn't critical.

Caveat: You'll need to monitor and patch the instance, or at least run a cron job to reboot it occasionally like it's 2008. You'll probably also want an Auto Scaling group to keep one or two instances up at all times.

AI might tell you NAT Gateway is "the best practice" because it's in the docs. But your wallet doesn't care about best practices — it frequently cares about good enough.


📄 2. Send VPC Flow Logs to S3, Not CloudWatch

The problem: CloudWatch is great until you accidentally log everything and forget to rotate.

The fix: Point VPC Flow Logs to S3 and parse them later with Athena or open-source tools. A minimal Terraform sketch is below.
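Here's one way to wire that up in Terraform. The bucket name, VPC reference, and 90-day retention are placeholder assumptions; pick a lifecycle that matches your own retention needs.

resource "aws_s3_bucket" "flow_logs" {
  bucket = "example-vpc-flow-logs"
}

resource "aws_flow_log" "example" {
  vpc_id               = aws_vpc.example.id
  traffic_type         = "ALL"
  log_destination_type = "s3"
  log_destination      = aws_s3_bucket.flow_logs.arn
}

# Cheap storage gets even cheaper when old logs expire automatically.
resource "aws_s3_bucket_lifecycle_configuration" "flow_logs" {
  bucket = aws_s3_bucket.flow_logs.id

  rule {
    id     = "expire-flow-logs"
    status = "Enabled"

    filter {}

    expiration {
      days = 90
    }
  }
}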

Why this works:
You get cheap storage with lifecycle policies, no ingestion costs, and better query control via Athena. While VPC Flow Logs can be critical for diagnosing problems, they're typically one of those things you set up once and hope to never look at again. So it makes sense to prioritize cheaper storage, even if it's a bit harder to query — you'll rarely need it anyway.

Caveat: There's a delay before logs land in S3, and Athena has a learning curve if you're used to "tail -f". Athena also has its own costs (around $5 per TB of data scanned), so if you're regularly querying your Flow Logs it might not be the best fit.

AI might suggest sticking with CloudWatch for real-time monitoring. That's great if you need it. But most of us don't need real-time logs of port 443 traffic every millisecond.


🏷 3. Use Spot Instances Where You Can (Even in Prod)

The problem: You're paying on-demand rates for workloads that don't demand anything.

The fix: Use EC2 Spot Instances with on-demand fallback in a mixed-instances Auto Scaling group, or EKS managed node groups with spot capacity.

Why this works:
Spot instances can be up to 90% cheaper, and AWS gives a two-minute interruption notice before reclaiming them. For batch jobs, autoscaling fleets, or anything stateless — they're a no-brainer. Spot pricing is usually lower than on-demand even after reserved instance or savings plan discounts are applied.

Example: Here is a Terraform example of an aws_autoscaling_group that runs a mix of spot and on-demand instances.

resource "aws_autoscaling_group" "example" {
  capacity_rebalance  = true
  desired_capacity    = 5
  max_size            = 10
  min_size            = 2
  vpc_zone_identifier = [aws_subnet.example1.id, aws_subnet.example2.id]

  mixed_instances_policy {
    instances_distribution {
      on_demand_base_capacity                  = 0
      on_demand_percentage_above_base_capacity = 25
      spot_allocation_strategy                 = "price-capacity-optimized"
    }

    launch_template {
      launch_template_specification {
        launch_template_id = aws_launch_template.example.id
      }

      override {
        instance_type     = "c4.large"
        weighted_capacity = "3"
      }

      override {
        instance_type     = "c3.large"
        weighted_capacity = "2"
      }
    }
  }
}

WARNING: The default value for spot_allocation_strategy is lowest-price, which AWS does not recommend using. I agree completely, as it can result in overly frequent evictions. You'll most likely want price-capacity-optimized instead. Here is more info on the various allocation strategies.

Caveat: You might go months without a spot eviction… until suddenly all your instances get reclaimed at once. Mitigate this by mixing instance types and keeping some on-demand capacity as a backup. Don't run your monolith's database on a Spot instance unless you like chaos.

AI might shrug and call Spot 'unreliable' — but it depends on your region, instance type, and workload. A human would check Spot Advisor and use autoscaling wisely.


📦 Bonus: Ditch Some Managed Services

Sometimes the best drop-in replacement is dropping the service altogether:

  • AWS Systems Manager Session Manager vs. Bastion hosts? Great! But if you're the only engineer, maybe SSH with MFA is enough.

  • Elastic File System (EFS)? Overkill for many workloads. A local volume or S3-backed cache may do the trick.

  • CloudWatch Alarms? Try VictoriaMetrics or Grafana Cloud with a push gateway.

The key isn't to abandon managed services. It's to use the right tool at the right scale — and know when you're being charged a convenience tax.


👀 Final Thoughts

Saving on AWS is less about heroics and more about knowing the defaults are often the expensive path. And here's the rub: AI will rarely warn you. Most LLMs don't have your billing alerts or historical cost spikes in context. They'll happily suggest "best practices" that sound great in theory but quietly eat your budget alive.

That's why this blog is written by a human. A human who has made the mistakes, seen the charges, and lived to optimize another day.

So go ahead — outsmart the defaults.


Want more human-powered DevOps hacks like this? Follow along. I promise not to recommend Kubernetes for every problem.

✍️ About the author
I'm Ted, a cloud infrastructure and blockchain engineer focused on practical DevOps and cost optimization strategies.

📬 Want more like this? Subscribe to my newsletter or follow me on Hashnode and Medium.

☕ Found this helpful? Buy me a coffee
