Switching Colors in Production: Blue/Green on ECS

Blue/Green deployments have always felt like a magic trick — two identical environments running in parallel, one live, one in waiting, and with a single switch you move production traffic over without anyone noticing. This month, I implemented exactly that for our FeedbackHub application on AWS ECS Fargate using Terraform.

This is Phase 3 of my AWS-Learning roadmap — and it’s a milestone worth documenting.

🎯 Why Blue/Green?

Zero downtime: Swap environments without taking production offline.
Safe rollback: If the new version (Green) misbehaves, instantly flip back to Blue.
Confidence in release: Test Green in production without risking the main environment.

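Before Blue/Green became common, around the 2015–2020 era, we used to create a full replica of production, deploy new code there, run quick smoke tests, and then switch the DNS. It worked, but it was clunky, risky, and stressful: DNS changes only take effect as cached records expire, so both the cutover and any rollback were slow and hard to control. Shifting traffic at the load balancer instead removes most of that hassle.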

That’s why I often say: “Before DevOps was a buzzword, I was already scaling Drupal on EC2s. Now I’m just rewriting it with the help of modern tools and techniques!”

🏗 Architecture Overview (Pyramid View)

The setup:

ALB at the top routing all production traffic
Two Target Groups (Blue & Green) beneath it
Two ECS Services (Blue active, Green staging) beneath the Target Groups
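
As a rough sketch of the middle layer of that pyramid, the two target groups might be declared like this. The resource names match the ones used in the weighted rule later in this post; the `vpc_id` variable, the target group names, the port, and the health check path are assumptions for illustration, not values taken from the FeedbackHub repo.

```hcl
# Hypothetical input: the VPC that hosts the ALB and the Fargate tasks.
variable "vpc_id" {
  type = string
}

# Blue target group: serves the version currently in production.
resource "aws_lb_target_group" "app" {
  name        = "feedbackhub-blue-tg" # assumed name
  port        = 3000                  # assumed container port
  protocol    = "HTTP"
  vpc_id      = var.vpc_id
  target_type = "ip" # Fargate tasks use awsvpc networking, so they register by IP

  health_check {
    path    = "/" # assumed health check path
    matcher = "200-399"
  }
}

# Green target group: same shape, serves the version under test.
resource "aws_lb_target_group" "feedbackhub_green_tg" {
  name        = "feedbackhub-green-tg" # assumed name
  port        = 3000                   # assumed container port
  protocol    = "HTTP"
  vpc_id      = var.vpc_id
  target_type = "ip"

  health_check {
    path    = "/" # assumed health check path
    matcher = "200-399"
  }
}
```

Each ECS service then registers its tasks with its own target group, so the ALB can treat Blue and Green as two independent pools.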

🔍 Visual: Pyramid Flow

flowchart TD
    UserRequests((User Requests)) --> ALBListener{{ALB Listener}}
    ALBListener --> BlueTG[Blue Target Group - 90% Traffic]
    ALBListener --> GreenTG[Green Target Group - 10% Traffic]
    BlueTG --> BlueService[ECS Blue Service - Active Prod]
    GreenTG --> GreenService[ECS Green Service - Staging Test]
    BlueService -.->|Rollback| ALBListener
    GreenService -.->|Promote| ALBListener

Legend:

User Requests → Incoming traffic from users
ALB Listener → Routes traffic based on weights
Blue Target Group → Current production service (receives majority traffic)
Green Target Group → New version under test (receives minority traffic)
Rollback Path → Instant switch back to Blue if needed
Promote to Prod → Switch Green to full production when stable

⚙️ Implementation Highlights

Terraform modules for clean separation:
- alb → ALB, TGs, listener rules
- ecs/feedbackhub_service → Blue service
- ecs/feedbackhub_service_green → Green service

Weighted rule:

forward {
  target_group {
    arn    = aws_lb_target_group.app.arn # Blue
    weight = 90
  }
  target_group {
    arn    = aws_lb_target_group.feedbackhub_green_tg.arn # Green
    weight = 10
  }
}

Rollout strategy: Shift traffic gradually, monitor CloudWatch logs, and roll back if needed
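
One way to make the gradual shift a one-line change is to drive both weights from a single variable. The sketch below is an assumption about how that could look, not the exact code in the repo: `green_weight`, `listener_arn`, the rule priority, and the catch-all path condition are all hypothetical, and it expects the two target groups referenced in the weighted rule to exist in the same module.

```hcl
# Hypothetical knob: percentage of traffic sent to Green.
variable "green_weight" {
  description = "Percentage of traffic sent to Green (0-100)."
  type        = number
  default     = 10

  validation {
    condition     = var.green_weight >= 0 && var.green_weight <= 100 && floor(var.green_weight) == var.green_weight
    error_message = "green_weight must be a whole number between 0 and 100."
  }
}

# Hypothetical input: ARN of the ALB listener this rule attaches to.
variable "listener_arn" {
  type = string
}

resource "aws_lb_listener_rule" "weighted" {
  listener_arn = var.listener_arn
  priority     = 100 # assumed priority

  action {
    type = "forward"

    forward {
      target_group {
        arn    = aws_lb_target_group.app.arn # Blue gets the remainder
        weight = 100 - var.green_weight
      }
      target_group {
        arn    = aws_lb_target_group.feedbackhub_green_tg.arn # Green
        weight = var.green_weight
      }
    }
  }

  # Listener rules need at least one condition; this one matches every path.
  condition {
    path_pattern {
      values = ["/*"]
    }
  }
}
```

With this in place, `terraform apply -var="green_weight=50"` moves half the traffic to Green, `green_weight=100` promotes it, and `green_weight=0` is the rollback.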

🧪 Testing the Cutover

Deploy new version to Green
Gradually increase traffic weight to Green
Validate health checks & logs in CloudWatch
If stable → shift 100% traffic to Green
If issues → roll back instantly to Blue
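
To make the "validate health checks" step less manual, a CloudWatch alarm on Green's target group is one option. This is a minimal sketch under assumptions: the alarm name, thresholds, and the `alb_arn_suffix` variable are hypothetical, and it relies on the Green target group from the weighted rule above being defined in the same module.

```hcl
# Hypothetical input: the ALB's arn_suffix (for example, aws_lb.<name>.arn_suffix).
variable "alb_arn_suffix" {
  type = string
}

# Fires when any Green target fails its health check for two consecutive minutes.
resource "aws_cloudwatch_metric_alarm" "green_unhealthy_hosts" {
  alarm_name          = "feedbackhub-green-unhealthy-hosts" # assumed name
  namespace           = "AWS/ApplicationELB"
  metric_name         = "UnHealthyHostCount"
  statistic           = "Maximum"
  period              = 60
  evaluation_periods  = 2
  comparison_operator = "GreaterThanThreshold"
  threshold           = 0
  treat_missing_data  = "notBreaching" # no data points means no unhealthy targets reported

  dimensions = {
    TargetGroup  = aws_lb_target_group.feedbackhub_green_tg.arn_suffix
    LoadBalancer = var.alb_arn_suffix
  }
}
```

If the alarm goes off while Green is taking a small share of traffic, that is the signal to stop increasing the weight and shift back to Blue.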

Fun fact: Our /green test URL had some Next.js quirks (redirects), so we validated Green by routing traffic through the ALB weighted rules instead — a very real production lesson in handling frontend routing.

📚 Lessons Learned

Blue/Green is less about URLs and more about traffic control at the ALB level
Weighted rules are a powerful safety net
Clean Terraform modules make this setup repeatable and maintainable
Always have a rollback command ready

🚀 What’s Next

This completes Phase 3 of my AWS-Learning roadmap. Next up:

Phase 3.5 → AWS Bedrock integration to analyze ECS CloudWatch logs (AI-powered observability!)
Phase 4 → CI/CD pipeline for Blue/Green deployments using GitHub Actions

Stay tuned — we’re bringing AI into the mix next.

💬 Have you implemented Blue/Green with ECS or another platform? What was your biggest lesson?

🔗 Resources & Links

💻 Repo: GitHub Repository

🌐 Live App: FeedbackHub Deployment (May be snoozing to save AWS bill!)

🔗 Connect: LinkedIn Profile (Come for the DevOps talk, stay for the ECS nap jokes!)
