Multi-Cloud Reality: A Platform Engineer's Perspective


As a Platform Engineer working across GCP and Azure, I've witnessed firsthand the evolution of cloud strategies. The concept of multi-cloud has gone from a theoretical ideal to a practical reality for many organisations. But throughout this journey, I've often questioned whether the complexity of managing multiple cloud providers delivers the promised benefits or simply creates additional overhead.
The Single-Cloud Beginning
Like many organisations, we initially committed to a single cloud provider. The decision was pragmatic—focusing our team's expertise on mastering one platform seemed the most efficient path forward. We built our infrastructure on Azure, developed deep expertise in its services, and created streamlined CI/CD pipelines tailored to its specific quirks and features.
This approach served us well for some time. We benefited from volume discounts, established relationships with support teams, and developed institutional knowledge that made operations relatively smooth. Our team became increasingly proficient with Azure-specific tools and services, which accelerated our development cycles.
However, as our company expanded, we encountered limitations. Our data analysts and scientists began requesting more sophisticated analytics capabilities for processing massive datasets. They found Azure's data solutions increasingly limiting for their specific analytical workloads, particularly as our data volumes grew. Additionally, certain AI and ML services they needed were more mature on GCP than Azure at the time.
The GCP Analytics Advantage
The tipping point came when our analytics team demonstrated how GCP's suite of data tools could transform our approach to business intelligence. Several key services stood out:
BigQuery became a game-changer for our analysts. Its serverless architecture allowed them to run complex SQL queries against petabyte-scale datasets without provisioning infrastructure. The separation of storage and compute, each billed independently, also aligned well with our variable workload patterns, offering significant cost advantages over Azure Synapse Analytics for our use cases.
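To make that concrete, here is a minimal sketch using the google-cloud-bigquery Python client; the project, dataset, and table names are placeholders rather than anything we actually ran, but the shape of the workflow is the point: no clusters to size, just a query submitted against a managed table.

```python
from google.cloud import bigquery

# Client picks up credentials from the environment; the project ID is a placeholder.
client = bigquery.Client(project="example-analytics-project")

# Aggregate a large event table without provisioning any compute ourselves;
# BigQuery allocates slots on demand and bills per byte scanned.
query = """
    SELECT event_date, COUNT(*) AS events
    FROM `example-analytics-project.warehouse.events`
    WHERE event_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
    GROUP BY event_date
    ORDER BY event_date
"""

for row in client.query(query).result():
    print(f"{row.event_date}: {row.events}")
```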
Dataflow dramatically simplified our ETL processes. Built on Apache Beam, it handled both batch and stream processing within a unified programming model, reducing the development overhead we experienced with Azure Data Factory for complex transformations.
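A small Apache Beam sketch (Python SDK) illustrates that unified model; the bucket paths and project details are illustrative. The same transform chain runs as a batch job here, and swapping the bounded file source for an unbounded one such as Pub/Sub would turn it into a streaming job without rewriting the transforms.

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_and_clean(line):
    """Placeholder transform standing in for real cleansing logic."""
    return line.strip().lower()


# Illustrative options; running this for real would submit a job to Dataflow.
options = PipelineOptions(
    runner="DataflowRunner",
    project="example-analytics-project",
    region="europe-west2",
    temp_location="gs://example-bucket/tmp",
)

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        # Bounded (batch) source; replacing it with beam.io.ReadFromPubSub(...)
        # would make the same pipeline a streaming job.
        | "Read" >> beam.io.ReadFromText("gs://example-bucket/raw/*.csv")
        | "Clean" >> beam.Map(parse_and_clean)
        | "Write" >> beam.io.WriteToText("gs://example-bucket/clean/output")
    )
```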
Looker Studio (formerly Data Studio) provided our business teams with self-service analytics capabilities that integrated seamlessly with BigQuery, democratising data access across the organisation.
Dataproc offered a more streamlined approach to running our Spark jobs compared to Azure HDInsight, with faster cluster spin-up times and better integration with our growing GCP environment.
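For a sense of what "streamlined" means in practice, submitting a PySpark job to an existing Dataproc cluster is roughly this small with the google-cloud-dataproc client; the project, region, cluster, and bucket names below are made up.

```python
from google.cloud import dataproc_v1

project_id = "example-analytics-project"  # placeholder project
region = "europe-west2"

# The Dataproc API is regional, so point the client at the regional endpoint.
client = dataproc_v1.JobControllerClient(
    client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
)

# Reference an existing cluster and a PySpark script staged in Cloud Storage.
job = {
    "placement": {"cluster_name": "analytics-cluster"},
    "pyspark_job": {"main_python_file_uri": "gs://example-bucket/jobs/transform.py"},
}

operation = client.submit_job_as_operation(
    request={"project_id": project_id, "region": region, "job": job}
)
result = operation.result()
print(f"Job finished in state: {result.status.state.name}")
```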
Vertex AI unified our machine learning workflow, making it easier for our data scientists to build, train, and deploy models at scale—a process that had become increasingly fragmented across different Azure services.
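As a hedged sketch of that unified workflow using the Vertex AI Python SDK (the project, training script, and container image URIs are placeholders, not exact values), a custom training job can be defined, run, and deployed behind an endpoint in a handful of calls:

```python
from google.cloud import aiplatform

# Placeholder project and region; the SDK reads credentials from the environment.
aiplatform.init(project="example-analytics-project", location="europe-west2")

# Wrap a local training script in a managed custom training job.
# The container image URIs are illustrative prebuilt-image paths.
job = aiplatform.CustomTrainingJob(
    display_name="churn-training",
    script_path="train.py",
    container_uri="europe-docker.pkg.dev/vertex-ai/training/sklearn-cpu.1-0:latest",
    model_serving_container_image_uri=(
        "europe-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest"
    ),
)

# Train, register the resulting model, then deploy it to a managed endpoint.
model = job.run(machine_type="n1-standard-4", replica_count=1)
endpoint = model.deploy(machine_type="n1-standard-4")
```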
The Multi-Cloud Reality Check
Our multi-cloud journey began out of technical necessity rather than strategic choice. Integrating GCP's analytics capabilities with our existing Azure infrastructure became my primary focus. Initially, we operated them as separate islands, but this approach quickly proved unsustainable.
The challenges became apparent immediately:
Skills divide: Our Azure experts struggled with GCP concepts, and vice versa. Seemingly similar services often had subtly different behaviours that caused confusion.
Tooling complexity: Our existing automation tools needed significant reworking to accommodate both cloud providers. Even with Terraform's multi-cloud capabilities, we found ourselves maintaining parallel configurations.
Cost visibility: Tracking and optimising spending across platforms required new processes and tools that we hadn't anticipated.
Security consistency: Ensuring consistent security controls and compliance across environments became a significant challenge, requiring additional governance frameworks.
Despite these hurdles, we gradually developed a functional multi-cloud approach. The most immediate benefit was the ability to place workloads where they made the most sense—both technically and commercially. Our data and analytics workloads flourished on GCP, while many of our enterprise applications remained on Azure where they continued to benefit from tight integration with Microsoft's ecosystem.
Finding the Right Balance
Our multi-cloud strategy evolved beyond simply maintaining separate environments. Instead, we developed a layered approach:
Core Infrastructure Layer: We standardised networking configurations, identity management, and monitoring across both clouds. This created a consistent foundation that made cross-cloud operations more predictable.
Platform Services Layer: For databases, storage, and compute, we leaned into cloud-native services but created abstraction layers that allowed our applications to be somewhat portable (a simplified sketch follows this list).
Application Layer: New applications were designed with potential portability in mind, but not at the expense of leveraging valuable cloud-specific features when appropriate.
Data and Analytics Layer: This became primarily GCP-focused, taking full advantage of Google's data processing capabilities while maintaining secure data pathways to and from our Azure environments.
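As a simplified illustration of the platform services idea (not our production code), a thin object-storage facade over Google Cloud Storage and Azure Blob Storage keeps application code free of direct SDK imports; the bucket and container names here are placeholders.

```python
from google.cloud import storage as gcs
from azure.storage.blob import BlobServiceClient


class ObjectStore:
    """Minimal write-only facade so application code doesn't import cloud SDKs directly."""

    def put(self, key: str, data: bytes) -> None:
        raise NotImplementedError


class GcsStore(ObjectStore):
    def __init__(self, bucket_name: str):
        self._bucket = gcs.Client().bucket(bucket_name)

    def put(self, key: str, data: bytes) -> None:
        self._bucket.blob(key).upload_from_string(data)


class AzureBlobStore(ObjectStore):
    def __init__(self, connection_string: str, container: str):
        service = BlobServiceClient.from_connection_string(connection_string)
        self._container = service.get_container_client(container)

    def put(self, key: str, data: bytes) -> None:
        self._container.upload_blob(name=key, data=data, overwrite=True)


# Application code only sees ObjectStore; which cloud backs it is a deployment detail.
store: ObjectStore = GcsStore("example-analytics-bucket")
store.put("reports/latest.json", b'{"status": "ok"}')
```

We deliberately kept such facades narrow. Trying to wrap every feature of every service is exactly the over-abstraction trap described later in this post.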
This balanced approach has given us flexibility without excessive complexity. Rather than trying to achieve complete portability (which often leads to using the lowest common denominator of features), we've accepted that some workloads will be tightly coupled to a specific cloud while ensuring our overall architecture remains adaptable.
Lessons from the Field
The multi-cloud journey has been enlightening, yielding several key insights:
What Works Well
Leveraging unique strengths: Each cloud has services where it clearly excels. Azure's integration with Microsoft products and identity services, and GCP's superior data processing and analytics capabilities, each provide distinct advantages.
Negotiating power: Having an established presence across providers has given us leverage in contract negotiations.
Fault isolation: Critical systems distributed across clouds have demonstrated better resilience during provider-specific outages.
Talent attraction: Our multi-cloud environment has proven attractive to engineers looking to broaden their cloud expertise.
What Doesn't Work
Trying to abstract everything: Our early attempts to create provider-agnostic abstractions for every service led to overly complex solutions that limited functionality.
Assuming similar services are identical: Similar-sounding services often have fundamentally different architectures and constraints.
Underestimating the operational burden: The cognitive load on teams managing multiple cloud environments is substantial and requires deliberate knowledge management strategies.
Expecting immediate cost savings: The overhead of multi-cloud often outweighs any negotiated discounts in the short term.
Is Multi-Cloud Worth It?
After years of managing a multi-cloud environment, I've come to believe that multi-cloud is neither a panacea nor a mistake—it's a strategic choice with specific trade-offs.
For organisations with these characteristics, multi-cloud is likely beneficial:
Diverse technical requirements that align with the strengths of different providers (like our need for both Microsoft ecosystem integration and Google's analytics capabilities)
Strong regulatory or client-driven needs for particular cloud environments
Mature DevOps practices and automation capabilities
The scale to justify the additional operational complexity
Conversely, organisations might want to avoid multi-cloud if:
They lack the engineering resources to develop expertise across platforms
Their workloads are relatively homogeneous
They don't have the automation maturity to manage environments consistently
The benefits don't outweigh the additional operational costs
Conclusion: Pragmatism Over Dogma
Multi-cloud architecture isn't inherently good or bad—it's a strategic choice that must align with business objectives and organisational capabilities. In our case, the initial complexity has been worth the flexibility and resilience we've gained. More importantly, it's enabled our data analysts and scientists to leverage GCP's superior analytics tools while maintaining our existing Azure investments.
The key to our success has been embracing a pragmatic approach rather than dogmatically pursuing either complete cloud independence or single-cloud purity. We use each cloud where it makes sense—Azure for its enterprise application ecosystem and GCP for its data processing prowess—create consistency where possible, and accept difference where necessary.
For those considering a multi-cloud strategy, I'd recommend starting with clear business objectives rather than technical ideals. Understand what problems you're trying to solve and whether multi-cloud is the most effective solution. Build your strategy incrementally, focusing first on core infrastructure and gradually expanding as your capabilities mature.
The future of cloud computing isn't likely to be single-cloud or multi-cloud for everyone—it's about finding the right mix of services and providers that enables your organisation to achieve its goals efficiently and reliably. In that sense, multi-cloud isn't the destination; it's one possible path on a much longer technological journey.