GCP to AWS Migration - Part 1: Architecture, Data Transfer & Infrastructure Setup

Table of contents
- Why We Migrated: Business Drivers Behind the Move
- Step 0: Creating a Migration Runbook
- Infrastructure Mapping: GCP vs AWS
- Phase 1: AWS Network Infrastructure Setup
- Phase 2: Data Migration (GCS → S3)
- Phase 3: Database Migration
- End of Part 1: Setting the Stage for Migration

Why We Migrated: Business Drivers Behind the Move
Our platform, serving millions of daily users, was running smoothly on GCP. However, evolving business goals, pricing considerations, and long-term cloud ecosystem alignment led us to migrate to AWS.
Key components of our GCP-based stack:
Web Tier: Next.js frontend + Django backend
Databases: MongoDB replica sets, MySQL clusters
Asynchronous Services: Redis, RabbitMQ
Search: Apache Solr for full-text search
Infrastructure: GCP Compute Engine VMs, managed instance groups, HTTPS Load Balancer
Storage: 21 TB of data in Google Cloud Storage (GCS)
Step 0: Creating a Migration Runbook
We treated this as a mission-critical project. Our runbook included:
Stakeholders: CTO, DevOps Lead, Database Architect, Application Owners
Timeline: 8 weeks from planning to cutover
Phases: Network Setup → Data Migration → Database Sync → Application Migration → Cutover
Rollback Plan: Prepared and rehearsed with timelines for failback
Infrastructure Mapping: GCP vs AWS
Challenges in Mapping:
GCP allows custom CPU and RAM configurations, while AWS uses fixed instance types (t3, m6i, r6g, etc.)
IOPS differences between GCP SSDs and AWS EBS gp3 volumes required tuning
Cost models vary significantly (especially egress charges from GCP)
We used AWS Pricing Calculator and GCP Pricing Calculator to simulate monthly billing and select cost-optimized instance types.
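We did the actual sizing in the web calculators, but the same comparison can also be scripted. The sketch below is illustrative rather than our real tooling: it uses the AWS Price List API via boto3 to estimate monthly on-demand cost for a few candidate instance types standing in for a GCP custom shape. The `CANDIDATES` mapping, the Mumbai location string, and the 730-hour month are assumptions for the example.

```python
# Sketch: compare on-demand pricing for candidate AWS instance types that could
# replace a GCP custom machine shape (e.g. 6 vCPU / 24 GB). Assumes credentials
# that can call the AWS Price List API (served from us-east-1).
import json
import boto3

pricing = boto3.client("pricing", region_name="us-east-1")

# Hypothetical hand-maintained mapping: GCP custom shape -> closest AWS types.
CANDIDATES = {"custom-6-24576": ["m6i.2xlarge", "r6g.xlarge", "t3.2xlarge"]}

def on_demand_hourly(instance_type: str, location: str = "Asia Pacific (Mumbai)") -> float:
    """Return the on-demand USD/hour price for a Linux, shared-tenancy instance."""
    resp = pricing.get_products(
        ServiceCode="AmazonEC2",
        Filters=[
            {"Type": "TERM_MATCH", "Field": "instanceType", "Value": instance_type},
            {"Type": "TERM_MATCH", "Field": "location", "Value": location},
            {"Type": "TERM_MATCH", "Field": "operatingSystem", "Value": "Linux"},
            {"Type": "TERM_MATCH", "Field": "tenancy", "Value": "Shared"},
            {"Type": "TERM_MATCH", "Field": "preInstalledSw", "Value": "NA"},
            {"Type": "TERM_MATCH", "Field": "capacitystatus", "Value": "Used"},
        ],
        MaxResults=1,
    )
    product = json.loads(resp["PriceList"][0])
    on_demand = product["terms"]["OnDemand"]
    dimension = next(iter(next(iter(on_demand.values()))["priceDimensions"].values()))
    return float(dimension["pricePerUnit"]["USD"])

for gcp_shape, aws_types in CANDIDATES.items():
    for itype in aws_types:
        print(f"{gcp_shape} -> {itype}: ~${on_demand_hourly(itype) * 730:.2f}/month on-demand")
```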
Phase 1: AWS Network Infrastructure Setup
AWS Network Infrastructure (ap-south-1)
                          ┌──────────────────────┐
                          │     GCP / DC VPC     │
                          └───────────┬──────────┘
                                      │
                          ┌───────────▼──────────┐
                          │   Site-to-Site VPN   │
                          └───────────┬──────────┘
                                      │
                          ┌───────────▼──────────┐
                          │   VPC (ap-south-1)   │
                          └───────────┬──────────┘
                                      │
            ┌─────────────────────────┼─────────────────────────┐
            │                         │                         │
┌───────────▼──────────┐  ┌───────────▼──────────┐  ┌───────────▼──────────┐
│  Public Subnet AZ1   │  │  Public Subnet AZ2   │  │  Public Subnet AZ3   │
│  - Bastion Host      │  │  - NAT Gateway       │  │  - Internet Gateway  │
└───────────┬──────────┘  └───────────┬──────────┘  └───────────┬──────────┘
            │                         │                         │
            ▼                         ▼                         ▼
┌──────────────────────┐  ┌──────────────────────┐  ┌──────────────────────┐
│   Private Subnet 1   │  │   Private Subnet 2   │  │   Private Subnet 3   │
│   App / DB Tier      │  │   App / DB Tier      │  │   App / DB Tier      │
└──────────────────────┘  └──────────────────────┘  └──────────────────────┘
Security Groups + NACLs as per GCP mapping
VPC Flow Logs → CloudWatch Logs
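For reference, enabling flow logs on the new VPC is only a couple of API calls. The sketch below is a minimal boto3 example; the VPC ID, log group name, and IAM role ARN are placeholders, not our production values.

```python
# Sketch: enable VPC Flow Logs delivered to CloudWatch Logs for the new VPC.
import boto3

ec2 = boto3.client("ec2", region_name="ap-south-1")
logs = boto3.client("logs", region_name="ap-south-1")

LOG_GROUP = "/vpc/flow-logs/prod"                                # hypothetical log group
ROLE_ARN = "arn:aws:iam::123456789012:role/vpc-flow-logs-role"   # placeholder role ARN

# Create the destination log group; ignore the error if it already exists.
try:
    logs.create_log_group(logGroupName=LOG_GROUP)
except logs.exceptions.ResourceAlreadyExistsException:
    pass

ec2.create_flow_logs(
    ResourceIds=["vpc-0abc1234def567890"],    # placeholder VPC ID
    ResourceType="VPC",
    TrafficType="ALL",
    LogDestinationType="cloud-watch-logs",
    LogGroupName=LOG_GROUP,
    DeliverLogsPermissionArn=ROLE_ARN,
)
```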
Components Breakdown
| Component       | Purpose                                                  |
| --------------- | -------------------------------------------------------- |
| 3 AZs           | High availability and fault tolerance                     |
| Public Subnets  | Bastion, NAT, IGW for ingress/egress                      |
| Private Subnets | Isolated app and DB tiers                                 |
| VPN             | Secure hybrid GCP-AWS connectivity                        |
| Security        | Security Groups + NACLs derived from GCP firewall rules   |
| Monitoring      | VPC Flow Logs + CloudWatch Metrics for visibility         |
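To give a flavour of how GCP firewall rules became Security Groups, here is a minimal sketch in boto3. The rule, CIDR, and VPC ID are invented for illustration; in practice we translated the full rule set (including NACLs) from an exported inventory rather than one rule at a time.

```python
# Sketch: recreate a single GCP firewall rule as an AWS security group rule.
# Rule contents and IDs are illustrative, not our production rule set.
import boto3

ec2 = boto3.client("ec2", region_name="ap-south-1")

# How a GCP firewall rule might be recorded before translation (hypothetical).
gcp_rule = {
    "name": "allow-app-to-db-3306",
    "source_ranges": ["10.10.0.0/16"],   # placeholder GCP VPC CIDR
    "protocol": "tcp",
    "port": 3306,
}

sg = ec2.create_security_group(
    GroupName=gcp_rule["name"],
    Description="Migrated from GCP firewall rule",
    VpcId="vpc-0abc1234def567890",        # placeholder VPC ID
)

ec2.authorize_security_group_ingress(
    GroupId=sg["GroupId"],
    IpPermissions=[{
        "IpProtocol": gcp_rule["protocol"],
        "FromPort": gcp_rule["port"],
        "ToPort": gcp_rule["port"],
        "IpRanges": [{"CidrIp": cidr} for cidr in gcp_rule["source_ranges"]],
    }],
)
```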
Phase 2: Data Migration (GCS → S3)
We migrated over 21 TB of user-generated and application asset data from Google Cloud Storage (GCS) to Amazon S3. Given the scale, this phase required surgical precision in planning, execution, and cost control.
Tools & Techniques Used
AWS DataSync:
Chosen for its efficiency, security, and ability to handle large-scale object transfers (a minimal task-setup sketch follows after the note below).
Service Account HMAC Credentials:
Used for secure bucket-to-bucket authentication between GCP and AWS.
Phased Sync Strategy:
Initial Full Sync: began 3 weeks before cutover
Delta Syncs: repeated every 2–3 days
Final Cutover Sync: during the 6-hour cutover window
Note: We carefully validated checksums and object counts after each sync phase to ensure data integrity and avoid overwriting unchanged files.
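Below is a minimal sketch of how such a DataSync task can be wired up with boto3, assuming a DataSync agent is already deployed. The bucket names, agent and role ARNs, and HMAC keys are placeholders; our production setup used separate tasks per bucket and environment.

```python
# Sketch: a DataSync task from a GCS bucket (addressed as generic object storage
# via its XML API with HMAC credentials) to an S3 bucket. All names and ARNs are
# placeholders.
import boto3

datasync = boto3.client("datasync", region_name="ap-south-1")

# GCS as the source, reached through storage.googleapis.com with HMAC keys.
source = datasync.create_location_object_storage(
    ServerHostname="storage.googleapis.com",
    BucketName="my-gcs-assets-bucket",        # placeholder
    AccessKey="GOOG1EXAMPLEHMACKEY",          # placeholder HMAC access key
    SecretKey="exampleSecret",                # placeholder HMAC secret
    AgentArns=["arn:aws:datasync:ap-south-1:123456789012:agent/agent-0example"],
)

# S3 as the destination.
destination = datasync.create_location_s3(
    S3BucketArn="arn:aws:s3:::my-s3-assets-bucket",   # placeholder
    S3Config={"BucketAccessRoleArn": "arn:aws:iam::123456789012:role/datasync-s3-role"},
)

# One task reused for the full sync, the delta syncs, and the final cutover sync.
task = datasync.create_task(
    SourceLocationArn=source["LocationArn"],
    DestinationLocationArn=destination["LocationArn"],
    Name="gcs-to-s3-assets",
    Options={
        "TransferMode": "CHANGED",                # only new/modified objects on reruns
        "VerifyMode": "ONLY_FILES_TRANSFERRED",   # checksum-verify what was copied
        "OverwriteMode": "ALWAYS",
    },
)

datasync.start_task_execution(TaskArn=task["TaskArn"])
```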
Smart Optimization Decisions
Selective Data Migration:
Identified several GCS buckets containing temporary compliance logs with auto-expiry policies.
Instead of migrating them and incurring egress charges, we chose to let them expire in GCP.
This alone saved several thousand dollars in unnecessary transfer costs.
Delta Awareness:
Designed the sync jobs to be delta-aware to prevent redundant data movement.
Ensured that only modified/new objects were transferred during delta and final syncs.
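A simplified version of the post-sync validation described above might look like the following. It only compares object counts and total bytes between the source and destination buckets (bucket names are placeholders); checksum-level verification was left to DataSync's own verification step.

```python
# Sketch: compare object counts and total bytes between a GCS bucket and its S3
# counterpart after a sync phase. Bucket names are placeholders.
import boto3
from google.cloud import storage

GCS_BUCKET = "my-gcs-assets-bucket"   # placeholder
S3_BUCKET = "my-s3-assets-bucket"     # placeholder

def gcs_inventory(bucket_name: str) -> tuple[int, int]:
    client = storage.Client()
    count = total = 0
    for blob in client.list_blobs(bucket_name):
        count += 1
        total += blob.size or 0
    return count, total

def s3_inventory(bucket_name: str) -> tuple[int, int]:
    s3 = boto3.client("s3")
    count = total = 0
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket_name):
        for obj in page.get("Contents", []):
            count += 1
            total += obj["Size"]
    return count, total

gcs_count, gcs_bytes = gcs_inventory(GCS_BUCKET)
s3_count, s3_bytes = s3_inventory(S3_BUCKET)
print(f"GCS: {gcs_count} objects / {gcs_bytes} bytes")
print(f"S3 : {s3_count} objects / {s3_bytes} bytes")
assert gcs_count == s3_count, "Object count mismatch: investigate before the next phase"
```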
Post-Migration S3 Tuning
After the bulk migration was completed, we fine-tuned our S3 environment for cost optimization, data hygiene, and long-term sustainability.
Lifecycle Policies Implemented:
Automatic archival of infrequently accessed data to S3 Glacier.
Expiry rules for:
Temporary staging files.
Orphaned or abandoned data older than a defined threshold.
Configured S3 Incomplete Multipart Upload Aborts:
- Any incomplete uploads are now automatically aborted after 7 days, preventing unnecessary storage billing from partial uploads caused by network or user errors.
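The rules above can live in a single lifecycle configuration. The sketch below shows the general shape using boto3; the bucket name, prefixes, and day thresholds are illustrative rather than our exact policy.

```python
# Sketch: one combined lifecycle configuration covering Glacier archival, expiry of
# temporary/stale data, and aborting incomplete multipart uploads after 7 days.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="my-s3-assets-bucket",   # placeholder
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-infrequently-accessed",
                "Filter": {"Prefix": ""},           # whole bucket
                "Status": "Enabled",
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
            },
            {
                "ID": "expire-staging-files",
                "Filter": {"Prefix": "staging/"},   # hypothetical prefix
                "Status": "Enabled",
                "Expiration": {"Days": 30},
            },
            {
                "ID": "abort-incomplete-multipart-uploads",
                "Filter": {"Prefix": ""},
                "Status": "Enabled",
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            },
        ]
    },
)
```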
Lessons Learned
Data Volume ≠ Data Complexity:
Even though we had the right tools for the job, coordinating syncs across staging, pre-prod, and production environments required careful orchestration and monitoring.
Egress and DTO Costs:
Data Transfer Out (DTO) from GCP was a hidden but substantial cost center; plan ahead for this when budgeting.
S3 Behavior Is Not GCS:
We had to adjust application logic and IAM policies post-migration to align with S3's object handling, access policies, and permissions model.
Phase 3: Database Migration
MongoDB Migration
Migrating MongoDB from GCP to AWS was one of the most sensitive components of the move due to its role in powering real-time operations and user sessions.
Our Strategy:
Replica Set Initialization: Set up MongoDB replica sets on AWS EC2 instances to mirror the topology running in GCP.
Oplog-Based Sync: Enabled oplog-based replication between AWS and GCP MongoDB nodes to ensure near real-time data synchronization without full data dumps.
Hybrid Node Integration: Deployed a MongoDB node in AWS, directly connected to the GCP replica set, acting as a bridge before full cutover.
iptables for Controlled Access: Used iptables rules to restrict write access during the sync period. This allowed inter-DB synchronization traffic only, blocking application-level writes and ensuring data consistency before switchover.
Failover Testing: Conducted multiple failover and promotion drills to validate readiness, with rollback plans in place.
Key Takeaway: Setting up a hybrid node and controlling access at the OS level allowed us to minimize data drift and test production-grade failovers without service disruption.
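For illustration, a lag check along these lines (run against the replica set with PyMongo) is what gated our failover drills. The connection URI is a placeholder and this is a simplified sketch, not our actual runbook script.

```python
# Sketch: check replication lag of the AWS hybrid node against the primary before
# allowing a promotion drill. The connection URI is a placeholder.
from pymongo import MongoClient

client = MongoClient("mongodb://replica-member.internal:27017/?replicaSet=rs0")  # placeholder

status = client.admin.command("replSetGetStatus")
members = status["members"]
primary = next(m for m in members if m["stateStr"] == "PRIMARY")

for member in members:
    if member["stateStr"] == "SECONDARY":
        lag = (primary["optimeDate"] - member["optimeDate"]).total_seconds()
        print(f"{member['name']}: {lag:.0f}s behind primary")
        # We only proceeded with a drill once lag on the AWS node was effectively zero.
```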
MySQL Migration
The MySQL component required careful orchestration to ensure transactional consistency and minimal downtime.
Our Approach:
Master-Slave Topology: Established a classic master-slave setup on AWS EC2 instances to replicate data from the GCP-hosted MySQL master.
Replication Lag Challenges: One of the major blockers encountered was replication lag during promotion drills, especially under active write-heavy workloads.
Controlled Write Freeze: We implemented iptables-based rules at the OS level to block application write traffic, allowing replication to catch up safely before cutover.
Promotion Strategy:
Executed a time-based cutover window.
Promoted the AWS slave node to master using a custom validation script to check replication offsets and ensure data integrity.
All secondary nodes were reconfigured to follow the new AWS master, ensuring consistency across the cluster.
Key Takeaway: Blocking writes via iptables provided a clean buffer for promotion without the risk of in-flight transactions, making the cutover smooth and predictable.
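A simplified sketch of that validation step, using PyMySQL against the AWS replica, is shown below. Connection details are placeholders, and the real promotion script performed additional integrity checks beyond this.

```python
# Sketch: verify that replication has fully caught up after the iptables write
# freeze, before promoting the AWS replica. Connection details are placeholders.
import pymysql

conn = pymysql.connect(
    host="mysql-replica.internal",   # placeholder AWS replica endpoint
    user="repl_check",
    password="***",
    cursorclass=pymysql.cursors.DictCursor,
)

with conn.cursor() as cur:
    cur.execute("SHOW SLAVE STATUS")
    status = cur.fetchone()

caught_up = (
    status["Seconds_Behind_Master"] == 0
    and status["Read_Master_Log_Pos"] == status["Exec_Master_Log_Pos"]
    and status["Slave_IO_Running"] == "Yes"
    and status["Slave_SQL_Running"] == "Yes"
)

print("Safe to promote" if caught_up else "Replication still catching up - wait")
```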
End of Part 1: Setting the Stage for Migration
You've seen how we architected an AWS environment from scratch, replicated critical systems like MongoDB and MySQL, and migrated over 21 TB of assets from GCP to S3, all while optimizing for cost, security, and scalability.
But this was just the calm before the storm.
"Give me six hours to chop down a tree and I will spend the first four sharpening the axe."
- Abraham Lincoln
We were well prepared. But would the systems (and the team) hold up during live cutover?
In Part 2: The Real Cutover & Beyond, we'll step into the fire:
What went wrong,
What we had to patch live,
And what we did to walk away from it stronger.
Don't miss it. Follow me on LinkedIn for more deep-dive case studies and real-world DevOps/CloudOps stories like this.
Written by
Cyril Sebastian
I'm Cyril Sebastian, a DevOps and Cloud Infrastructure architect with 10+ years of experience building, scaling, and securing cloud-native and hybrid systems. I specialize in automation, cost optimization, observability, and platform engineering across AWS, GCP, and Oracle Cloud. My passion lies in solving complex infrastructure challenges, from cloud migrations to Infrastructure as Code (IaC), and from deployment automation to scalable monitoring strategies. I blog here about: cloud strategy and migration playbooks; real-world DevOps and automation with Terraform, Jenkins, and Ansible; DevSecOps practices and security-first thinking in production; and monitoring, cost optimization, and incident response at scale. If you're building in the cloud, optimizing infra, or exploring DevOps culture, let's connect and share ideas! linkedin.com/in/sebastiancyril