GCP to AWS Migration – Part 2: Real Cutover, Issues & Recovery

Table of contents
- 🚀 Start of Part 2: The Real Cutover & Beyond
- ⚙️ Phase 4: Application & Infrastructure Layer Adaptation
- 🌐 Load Balancer Differences: GCP vs AWS
- 🔍 Phase 5: Apache Solr Migration
- 🛑 The Cutover Weekend
- 😮 Unexpected Issues (and What We Did)
- 🚀 Post-Migration Optimizations
- 🧠 End of Part 2: Final Thoughts & What’s Next

🚀 Start of Part 2: The Real Cutover & Beyond
While Part 1 laid the architectural and data groundwork, Part 2 is where the real-world complexity kicked in.
We faced:
- Database promotions that didn’t go as rehearsed,
- Lazy-loaded Solr indexes fighting with EBS latency,
- Hardcoded GCP configs in the dark corners of our stack,
- And the high-stakes pressure of a real-time production cutover.
If Part 1 was planning and theory, Part 2 was execution and improvisation.
Let’s dive into the live switch, the challenges we didn’t see coming, and how we turned them into lessons and long-term wins.
⚙️ Phase 4: Application & Infrastructure Layer Adaptation
As part of the migration, significant adjustments were required in both the application configuration and infrastructure setup to align with AWS's architecture and security practices.
Key Changes & Adaptations
**Private Networking & Bastion Access**
- All EC2 instances (except load balancers) were placed in private subnets for enhanced security.
- Initial access was via a VPN client → AWS Direct Connect → Bastion Host setup.
- Post-migration, the bastion host's public IP was decommissioned, relying solely on secure, internal access.
**CORS & S3 Policy Updates**
- Applications required updates to CORS headers to handle static content requests from a different domain (CloudFront).
- S3 bucket policies were reconfigured to allow read access only via CloudFront, blocking direct public access.
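The CloudFront-only bucket policy looks roughly like this. This is a minimal sketch using the Origin Access Control service-principal pattern; the bucket name and distribution ARN shown are placeholders, not our actual resources:

```python
import json

def cloudfront_only_policy(bucket: str, distribution_arn: str) -> str:
    """Build an S3 bucket policy that lets only a specific CloudFront
    distribution read objects (the Origin Access Control pattern)."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowCloudFrontReadOnly",
            "Effect": "Allow",
            "Principal": {"Service": "cloudfront.amazonaws.com"},
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
            # Requests are only honoured when made on behalf of this distribution.
            "Condition": {"StringEquals": {"AWS:SourceArn": distribution_arn}},
        }],
    }
    return json.dumps(policy, indent=2)

print(cloudfront_only_policy(
    "example-static-assets",
    "arn:aws:cloudfront::111122223333:distribution/EXAMPLEID"))
```

With a policy like this in place, hitting the S3 URL directly returns Access Denied, while the same object served through the distribution works.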
**Application Configuration Updates**
- All environment-specific settings, including `.env` variables, were audited and updated to replace hardcoded GCP endpoints with dynamic, AWS-native configurations (e.g., RDS endpoints, S3 URLs).
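An audit like this is easy to script. The sketch below is not the tool we ran, just an illustration of the idea; the marker list is an assumption and should be extended for your own stack:

```python
from pathlib import Path

# Substrings that usually betray a hardcoded GCP endpoint.
# This list is illustrative -- extend it for your stack.
GCP_MARKERS = ("googleapis.com", "gs://", "storage.cloud.google.com")

def audit_env_file(path: Path) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that still reference GCP."""
    findings = []
    for lineno, line in enumerate(path.read_text().splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blanks and comments
        if any(marker in stripped for marker in GCP_MARKERS):
            findings.append((lineno, stripped))
    return findings
```

Running it over every `.env` (and templated config) in the repo is what surfaces the "dark corner" references before the cutover does.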
**Internal DNS Transition**
- In GCP, internal DNS resolution is automatically managed.
- We replicated this behavior in AWS using Route 53 private hosted zones, ensuring seamless service discovery across private subnets.
**Static Asset Delivery via CloudFront**
- All requests to static assets in S3 were redirected through Amazon CloudFront, improving performance and reducing latency for global users.
**Security Hardening with WAF**
- Integrated AWS Web Application Firewall (WAF) in front of the CloudFront distribution.
- Applied enterprise-grade rules:
  - Rate limiting to prevent abuse
  - Geo-blocking and IP filtering based on security policies
  - DDoS protection via AWS Shield integration
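As an example of what one of these rules looks like, here is a rate-based rule in the JSON shape that WAFv2's CreateWebACL/UpdateWebACL APIs accept. The rule name and the 2000-requests-per-5-minutes limit are placeholders, not our production values:

```python
def rate_limit_rule(name: str, limit: int, priority: int) -> dict:
    """Build a WAFv2 rate-based rule in the shape accepted by
    CreateWebACL / UpdateWebACL. `limit` is requests per 5-minute
    window per source IP."""
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {
            # Aggregate request counts by source IP.
            "RateBasedStatement": {"Limit": limit, "AggregateKeyType": "IP"},
        },
        "Action": {"Block": {}},  # block offenders outright
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": name,
        },
    }

rule = rate_limit_rule("rate-limit-all", limit=2000, priority=1)
```

Geo-blocking and IP filtering follow the same pattern with `GeoMatchStatement` and `IPSetReferenceStatement` in place of the rate-based statement.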
**Firewall Rules: Securing AWS MySQL from Legacy Sources**
To ensure controlled access to the new MySQL server in AWS, we hardened the instance using explicit iptables rules. These rules:
- Blocked direct MySQL access from legacy or untrusted subnets (e.g., GCP app subnets)
- Allowed SSH access only from trusted bastion/admin IPs during the migration window
FIREWALL RULES FLOW:

[Blocked Sources] ❌
  10.AAA.0.0/22 ──────┐
  10.BBB.248.0/21 ────┴── DROP :3306 ───▶ ┌─────────────────┐
                                          │ 10.BBB.CCC.223  │
[Allowed Sources] ✅                      │  MySQL Server   │
  10.AAA.0.4/32 ──────┐                   │     (AWS)       │
  10.BBB.248.158/32 ──┤                   └────────▲────────┘
  10.BBB.251.107/32 ──┼── ACCEPT :22 ──────────────┘
  10.BBB.253.9/32 ────┘

Legend:
- 10.AAA.x.x = Source network (GCP)
- 10.BBB.CCC.223 = Target MySQL server in AWS
- IPs like 10.BBB.248.158 = Bastion or trusted admin IPs allowed for SSH
This rule-based approach gave us an extra layer of protection beyond AWS Security Groups during the critical migration phase.
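The exact rule set wasn't published, but the flow above translates into iptables commands along these lines. This sketch just renders the commands from the CIDR lists and keeps the redacted placeholders from the diagram:

```python
def mysql_firewall_rules(blocked_subnets: list[str], ssh_admins: list[str]) -> list[str]:
    """Render iptables commands matching the flow above: DROP MySQL
    (3306) from legacy subnets, ACCEPT SSH (22) only from trusted
    bastion/admin hosts, then DROP any other SSH attempt."""
    rules = [f"iptables -A INPUT -s {cidr} -p tcp --dport 3306 -j DROP"
             for cidr in blocked_subnets]
    rules += [f"iptables -A INPUT -s {ip} -p tcp --dport 22 -j ACCEPT"
              for ip in ssh_admins]
    # Default-deny SSH for everything that didn't match an ACCEPT above.
    rules.append("iptables -A INPUT -p tcp --dport 22 -j DROP")
    return rules

rules = mysql_firewall_rules(
    blocked_subnets=["10.AAA.0.0/22", "10.BBB.248.0/21"],
    ssh_admins=["10.AAA.0.4/32", "10.BBB.248.158/32",
                "10.BBB.251.107/32", "10.BBB.253.9/32"],
)
```

Because iptables evaluates rules in order, the specific ACCEPTs must precede the catch-all DROP, which is why the generator appends the default-deny last.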
🌐 Load Balancer Differences: GCP vs AWS
During the migration, we encountered significant differences in how load balancing is handled between GCP and AWS. This required architectural adjustments and deeper planning around routing, SSL, and compute scaling.
📊 Comparison Overview
| Feature | GCP HTTPS Load Balancer | AWS Application Load Balancer (ALB) |
| --- | --- | --- |
| Scope | Global by default | Regional |
| TLS/SSL | Wildcard SSL was uploaded to GCP | Managed via AWS Certificate Manager (ACM) |
| Routing Logic | URL Maps | Target Groups with Listener Rules |
| IP Type | Static Public IP | CNAME with DNS-based routing |
| Backend Integration | Global Load Balancer → MIG (Managed Instance Groups) | ALB → Target Group → ASG (Auto Scaling Group) |
🧩 Key Migration Notes
**Static IP vs DNS Routing**
- In GCP, the HTTPS Load Balancer was fronted with a static public IP, offering low-latency global access.
- In AWS, ALB uses CNAME-based routing, meaning clients resolve the ALB DNS name (e.g., abc-region.elb.amazonaws.com) via Route 53 or third-party DNS.
**Routing Mechanism Differences**
- GCP’s URL maps allowed expressive, path-based routing across services.
- AWS required translating these into listener rules and target groups, often resulting in more granular configurations.
**SSL/TLS Certificates**
- GCP handled our custom wildcard SSL certificate.
- In AWS, we migrated to ACM (AWS Certificate Manager) for easier management of domain validations, renewals, and usage across ALBs and CloudFront.
**Application-Specific Custom Rules**
- In AWS, we created custom listener rules to forward traffic based on request path or headers, similar to GCP’s URL maps.
- These rules were especially useful for routing requests to internal APIs and static content.
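The translation from URL maps to listener rules can be pictured as a priority-ordered, first-match-wins table. The sketch below simulates that evaluation; the rule set and target-group names are made up for illustration, and real ALB path conditions use wildcard patterns rather than bare prefixes:

```python
from dataclasses import dataclass

@dataclass
class ListenerRule:
    priority: int
    path_prefix: str   # simplified: real ALB conditions also match hosts/headers
    target_group: str

def route(rules: list[ListenerRule], path: str, default: str = "tg-web") -> str:
    """First matching rule by priority wins, like ALB listener evaluation;
    the default action catches everything else."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if path.startswith(rule.path_prefix):
            return rule.target_group
    return default

rules = [
    ListenerRule(10, "/api/internal", "tg-internal-api"),
    ListenerRule(20, "/api", "tg-api"),
    ListenerRule(30, "/static", "tg-static"),
]
```

Note that the more specific `/api/internal` prefix needs the lower priority number, otherwise the broader `/api` rule would shadow it. That ordering discipline is exactly what made the AWS configurations more granular than GCP's URL maps.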
📮 Special Case: Postfix & Port 25 Restrictions
To migrate our Postfix mail servers, which use port 25 for SMTP, we had to submit an explicit request to AWS Support for port 25 to be opened (outbound) on our AWS account in the specific region. This approval was a prerequisite for creating a Network Load Balancer (NLB) that could pass traffic directly to the Postfix instances.
Note: AWS restricts outbound SMTP traffic on port 25 by default to prevent abuse, and lifting the restriction requires a support ticket, so be sure to factor this into your cutover timeline if you're migrating mail servers.
🔍 Phase 5: Apache Solr Migration
Apache Solr powered our platform's search functionality with complex indexing and fast response times. Migrating it to AWS introduced both architectural and operational complexities.
🛠️ Migration Strategy
**AMI Creation Was Non-Trivial:**
We created custom AMIs for Solr nodes with large EBS volumes. However, this surfaced two key challenges:
- Large-volume AMI creation took longer than expected.
- Lazy loading of attached volumes in AWS meant the data wasn’t instantly accessible upon instance boot-up.

**No AWS FSR:**
AWS Fast Snapshot Restore (FSR) could have helped—but was ruled out due to budget constraints. Without FSR, we observed delayed volume readiness post-launch.

**Index Rebuild from Source DB:**
Post-migration, we rebuilt Solr indexes from source data stored in MongoDB and MySQL, ensuring consistency and avoiding partial data issues.

**Master-Slave Architecture:**
We finalized a standalone Solr master-slave setup on EC2 after a dedicated PoC. This provided better control compared to GCP's managed instance groups.
🏗️ GCP vs AWS Deployment Model
| Feature | GCP MIGs | AWS EC2 Standalone |
| --- | --- | --- |
| Deployment | Solr slaves ran in Managed Instance Groups | Solr nodes deployed on standalone EC2s |
| Volume Attachment | Persistent volumes mounted with boot disk | EBS volumes suffered from lazy loading, slowing boot |
| Autoscaling | Fully autoscaled Solr slaves based on demand | Autoscaling impractical due to volume readiness delays |
| Cost Management | On-demand scaling saved costs | Used EC2 scheduling (shutdown/startup) to control spend |
⚡ Operational Decision: No Autoscaling for Solr in AWS
In GCP, autoscaling Solr slaves was seamless—new instances booted with attached volumes and joined the cluster dynamically.
However, in AWS, lazy loading of EBS volumes made autoscaling unreliable for time-sensitive indexing.
Instead, we:
- Kept EC2 nodes in a fixed topology
- Used scheduled start/stop scripts (via cron) to manage uptime during peak/off-peak hours.
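Pre-warming a lazily restored volume just means touching every block once so EBS hydrates it from the snapshot before the node takes traffic. A minimal illustration of that idea, assuming you point it at the raw device (in practice tools like `dd` or `fio` do the same job faster):

```python
def warm_volume(path: str, block_size: int = 1 << 20) -> int:
    """Sequentially read every block of a device or file so that EBS
    blocks restored lazily from a snapshot are pulled down before the
    Solr node serves queries. Returns the number of bytes read.
    On a real instance `path` would be the raw device, e.g. /dev/nvme1n1
    (device name is an assumption -- check lsblk on your instance)."""
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(block_size):
            total += len(chunk)
    return total
```

This first full read is exactly the cost FSR would have eliminated; without it, "instance running" does not mean "index readable at full speed".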
Lessons Learned
Solr migrations need deep consideration of disk behavior in AWS. If you're not using FSR, do not assume volume availability equals data availability. Factor in rebuild times, cost impact, and whether autoscaling truly benefits your workload.
🛑 The Cutover Weekend
We declared a deployment freeze 7 days before the migration to maintain stability and reduce last-minute surprises.
Pre-Cutover Checklist
- TTL reduced to 60 seconds to allow quick DNS propagation.
- Final S3 and database sync performed.
- Checksums validated for critical data.
- Route 53 routing policies configured to mimic GCP’s internal DNS.
- CloudWatch, Nagios, and Grafana set up for monitoring.
- Final fallback snapshot captured.
A comprehensive cutover runbook was prepared with clear task ownership and escalation paths.
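The checksum validation step boils down to comparing a manifest captured on the source side against the migrated files. A sketch of that check (the manifest format here is an assumption, just relative paths mapped to SHA-256 digests):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 without loading it whole into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def validate_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Compare files under `root` against a {relative_path: sha256}
    manifest captured on the source side; return paths whose
    checksums no longer match."""
    return [rel for rel, expected in manifest.items()
            if sha256_of(root / rel) != expected]
```

An empty return list is the green light; anything else names exactly which critical files need a re-sync before the DNS switch.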
🕒 Cutover Timeline
| Time Slot | Task |
| --- | --- |
| Hour 1 | Final S3 + DB sync |
| Hour 2–3 | DB failover and validation |
| Hour 4 | DNS switch from GCP to Route 53 |
| Hour 5–6 | Traffic validation + rollback readiness |
😮 Unexpected Issues (and What We Did)
| Problem | Solution |
| --- | --- |
| MySQL master switch had lag | Improved replica promotion playbook |
| Hardcoded GCP configs found | Emergency patching of ENV & redeploy |
| Solr slow to boot under load | Temporarily pre-warmed EC2 nodes |
🚀 Post-Migration Optimizations
- Rightsized EC2 instances using historical metrics
- Committed to Savings Plans for reserved workloads
- Enabled and tuned S3 lifecycle policies
- Set up automated AMI rotations and DB snapshots
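To make the lifecycle point concrete, here is the shape of an S3 lifecycle configuration as accepted by `put_bucket_lifecycle_configuration`. The prefix and day counts are illustrative placeholders, not the values we actually tuned:

```python
def lifecycle_config(prefix: str = "logs/") -> dict:
    """Build an S3 lifecycle configuration that tiers objects to
    cheaper storage classes over time and expires them after a year."""
    return {
        "Rules": [{
            "ID": f"tier-and-expire-{prefix.rstrip('/')}",
            "Filter": {"Prefix": prefix},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},  # infrequent access
                {"Days": 90, "StorageClass": "GLACIER"},      # cold archive
            ],
            "Expiration": {"Days": 365},  # delete after a year
        }],
    }
```

One rule per data class (logs, backups, user uploads) keeps the spend curve predictable without any manual cleanup jobs.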
🧠 End of Part 2: Final Thoughts & What’s Next
This journey from GCP to AWS wasn’t just about swapping clouds—it was a masterclass in operational resilience, cross-team coordination, and cloud-native rethinking.
We learned that:
- No plan survives contact without flexibility.
- Owning your infrastructure also means owning your edge cases.
- Migration is more than lift-and-shift—it's evolve or expire.
“Smooth seas never made a skilled sailor.”
— Franklin D. Roosevelt
This migration tested our nerves and processes, but ultimately, it left us with better observability, tighter security, and an infrastructure we could proudly call production-grade.
🔗 If this helped or resonated with you, connect with me on LinkedIn. Let’s learn and grow together.
👉 Stay tuned for more behind-the-scenes write-ups and system design breakdowns.
Written by

Cyril Sebastian
I’m Cyril Sebastian, a DevOps and Cloud Infrastructure architect with 10+ years of experience building, scaling, and securing cloud-native and hybrid systems. I specialize in automation, cost optimization, observability, and platform engineering across AWS, GCP, and Oracle Cloud. My passion lies in solving complex infrastructure challenges—from cloud migrations to Infrastructure as Code (IaC), and from deployment automation to scalable monitoring strategies.
I blog here about:
- Cloud strategy and migration playbooks
- Real-world DevOps and automation with Terraform, Jenkins, and Ansible
- DevSecOps practices and security-first thinking in production
- Monitoring, cost optimization, and incident response at scale
If you're building in the cloud, optimizing infra, or exploring DevOps culture—let’s connect and share ideas!
🔗 linkedin.com/in/sebastiancyril