Sovereign Cloud: A Fear-Based Promise Without a Data-Based Plan?

Mark Platte

Intro

Sovereign cloud is suddenly everywhere. It promises protection, control, and compliance, especially for data in critical sectors. But is it a technical solution, or just a comforting label?

In this post, I explore what makes sovereign cloud attractive, when it actually helps, and when it may distract us from doing the real work: understanding and classifying our data.

To start it off: This is my personal opinion, based on my own observations and experience.

Lately, I’ve been seeing a flood of articles about sovereign cloud, and how hyperscalers can’t truly guarantee data safety. Microsoft, for example, openly admits it cannot prevent U.S. authorities from accessing your data, even when it is stored in Europe.

This hit close to home as I’ve been updating our architecture principles (still pending approval), where cloud and AI are key themes. And while writing them, I kept circling back to the same realization:

It is not about where your data sits. It is about what you are trying to achieve with it. The value of your data, and the intent behind using it, should drive your architectural choices. Location alone does not guarantee control.

Instead of chasing a "sovereign" label or falling for one-size-fits-all solutions, we should go back to basics: Start with the data.

That brings us to the practical side of the discussion. Rather than debating abstract notions of sovereignty, it helps to look at how cloud is actually used in real-world scenarios and what that means for data control in practice.

But first, a quick note on data and classification: Not all data is equal. Some data is public or low-risk. Other data is highly regulated or inherently sensitive, such as patient records, legal documents, or telemetry from critical infrastructure.

How you classify your data determines what levels of protection, sovereignty, and control you actually need. That classification should be the starting point when deciding whether the public cloud is even an option.

Unfortunately, this step is often skipped or poorly implemented. That is ironic, because in my humble opinion, it is probably the most important step of all.
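
To make that a bit more concrete, here is a minimal sketch of what classification-driven decision making could look like. The labels and the required controls below are illustrative placeholders for whatever your own policy defines, not a standard.

```python
from enum import Enum

class Classification(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"          # e.g. contracts, procurement archives
    SPECIAL_CATEGORY = "special_category"  # e.g. patient records (GDPR art. 9)

# Illustrative policy: which controls must be available before data may leave the premises.
REQUIRED_CONTROLS = {
    Classification.PUBLIC: set(),
    Classification.INTERNAL: {"encryption_at_rest"},
    Classification.CONFIDENTIAL: {"encryption_at_rest", "customer_managed_keys", "audit_logging"},
    Classification.SPECIAL_CATEGORY: {"encryption_at_rest", "customer_managed_keys",
                                      "pseudonymisation", "local_backup", "audit_logging"},
}

def cloud_allowed(classification: Classification, available_controls: set) -> bool:
    """Public cloud is only an option when every required control can actually be met."""
    return REQUIRED_CONTROLS[classification] <= available_controls
```

The point is not the code itself, but the order of operations: classification comes first, and the platform choice follows from it.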

The following use cases are written under the assumption that data is stored within EU-based data centers or under European jurisdiction. This setup provides, to some extent, a first layer of protection through local laws such as the GDPR.

To explore this further, let’s look at some common use cases for storing data in cloud environments:

• BLOB Storage
• Temporary AI Workloads
• Data lake
• On-demand scaling

These use cases make perfect sense. And in some cases, they are genuinely hard to realize safely within the current ecosystem, especially if your constraints are strict and local.

I’m writing this from a hospital perspective, where many processes are still very much tied to the physical location of the hospital itself. That naturally creates a narrower frame of reference. One where physical presence, data locality, and regulatory oversight are deeply intertwined.

BLOB Storage

This is one of the most common cloud storage scenarios. It is often chosen as a cost-effective way to store inactive data that must be retained for legal or administrative reasons, even when it is rarely accessed.

In practice, this includes project documentation, procurement archives, tax records, contract files, and audit trails. Data that is not operationally critical, but legally important.

In the Netherlands and across the EU, this type of data often falls under:

• Dutch Tax Law, which requires a minimum retention period of 7 years (up to 10 for real estate)
• EU Public Procurement Directives, which require contract records and award documents to be preserved for multiple years
• Archiving requirements for subsidised projects, especially in research, education, or EU-funded programs

This use case can often be safely implemented in hyperscale cloud environments, if done correctly. For example:

• Encrypt files before upload
• Use long-term object storage (such as Azure Archive or Amazon S3 Glacier Deep Archive)
• Store encryption keys outside the cloud provider using a Customer Managed Key system
• Document metadata and classification externally, rather than relying on cloud-native indexing
• Ensure you can meet chain-of-custody and eDiscovery requirements, if needed later

By preparing and encrypting files locally before upload, and applying an additional layer of encryption at rest with your own keys, you create a setup where cost and compliance can go hand in hand. You maintain control. The provider stores only encrypted blobs.
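
As a rough illustration of that pattern, here is a minimal Python sketch using the `cryptography` package and the Azure Blob Storage SDK. The container name, paths, and the local key folder are placeholders; in a real setup the key would live in your own key management system, not next to the script.

```python
# Sketch: encrypt locally, keep the key on-premises, upload only ciphertext.
# Assumes `pip install cryptography azure-storage-blob`.
from pathlib import Path
from cryptography.fernet import Fernet
from azure.storage.blob import BlobServiceClient, StandardBlobTier

def archive_file(local_path: str, connection_string: str, container: str = "archive") -> None:
    # The key is generated and stored on-premises; it never reaches the provider.
    key = Fernet.generate_key()
    Path("keystore").mkdir(exist_ok=True)
    (Path("keystore") / (Path(local_path).name + ".key")).write_bytes(key)

    ciphertext = Fernet(key).encrypt(Path(local_path).read_bytes())

    service = BlobServiceClient.from_connection_string(connection_string)
    blob = service.get_blob_client(container=container, blob=Path(local_path).name + ".enc")
    # Archive tier: cheap long-term storage for rarely accessed, legally retained data.
    blob.upload_blob(ciphertext, overwrite=True, standard_blob_tier=StandardBlobTier.Archive)
```

The provider only ever sees encrypted blobs; the archive catalog and the keys stay with you.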

This makes hyperscale cloud usable even for data subject to strict public procurement rules or tax audits.

Of course, this approach does add some administrative overhead. You need to maintain your own archive catalog and ensure that your classification and metadata remain accessible independently of the cloud platform. But for long-term, low-access archives, the trade-off is often acceptable and significantly cheaper than managing on-premises storage infrastructure for 10 or more years.

Temporary AI Workloads

This use case really calls for a different mindset. AI models need a lake of data. That is a sneak preview of the next use case, and no, I am not sorry.

But seriously, the key here is data classification. What kind of data are you using? How sensitive is it? What happens if it is lost or exposed? And if it is critical, can you reduce the risk?

For example, by implementing pseudonymisation and keeping your own reversal tables, you can significantly lower the exposure risk. That way, even if the data is accessed, it does not directly identify individuals. Under the GDPR (AVG in Dutch), this can make a major difference in both compliance and liability.

Yes, this requires extra steps. But if you are working with a volatile or competitive workload that demands burst GPU or CPU capacity, it may be worth it. You gain access to scalable cloud resources without compromising too much on data protection, assuming the pseudonymisation is done properly and is reversible only under your control.
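
A minimal sketch of that idea, with made-up field names and token format, just to show where the reversal table lives:

```python
# Sketch: pseudonymise records before they leave the building; the reversal
# table stays in an on-premises store and is never uploaded with the data.
import secrets

reversal_table: dict = {}   # pseudonym -> original identifier, kept in-house

def pseudonymise(record: dict, identifying_fields=("patient_id", "name")) -> dict:
    out = dict(record)
    for field in identifying_fields:
        if field in out:
            token = "pseu-" + secrets.token_hex(8)
            reversal_table[token] = out[field]
            out[field] = token
    return out

def reidentify(record: dict) -> dict:
    """Reversal only works where the table lives: inside your own organisation."""
    return {key: reversal_table.get(value, value) if isinstance(value, str) else value
            for key, value in record.items()}
```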

Another approach is to change how you structure the workload. If your AI model supports it, you can stream data directly from your source systems, cache it temporarily, and dispose of it immediately after processing. This decreases the time window in which external access could occur.
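
Sketched very roughly, with `fetch_batch` and `run_model` standing in for your own source system and model calls:

```python
# Sketch: stream data in, hold it only in a short-lived local cache,
# and let it disappear as soon as the model has consumed it.
import tempfile

def process_stream(fetch_batch, run_model):
    while (batch := fetch_batch()) is not None:   # bytes from the source system, or None when done
        # The cached copy exists only inside this block and is deleted on close.
        with tempfile.NamedTemporaryFile() as cache:
            cache.write(batch)
            cache.flush()
            run_model(cache.name)                 # the model reads from the short-lived cache
        # leaving the block removes the cached file; nothing is retained afterwards
```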

In both cases, the underlying principle remains the same. Data classification and risk profiling are everything. And yes, this is the last time I will say it, because it applies to every use case in this blog.

Summary of Protection Mechanisms for Temporary AI Workloads

• Classify your data upfront: determine sensitivity, risk, and purpose before processing
• Pseudonymise sensitive data: remove direct identifiers and manage reversal tables internally
• Ensure reversibility remains in-house: only your organization should be able to re-identify data
• Stream and discard data quickly: minimize retention by caching temporarily and disposing post-processing
• Minimize cloud storage footprint: less stored data means reduced exposure and liability
• Base all decisions on a documented risk profile, not just available tooling

Data lake

This is arguably the most difficult use case from a data control and compliance standpoint. When you aim for real-time insights based on live operational data, trust in your storage and processing environment becomes essential.

At first glance, using Customer Managed Keys (CMKs) seems like a reasonable safeguard. You hold the keys, and the provider stores encrypted data. But this setup still leaves you operating inside a shared infrastructure. The provider controls the hardware, telemetry, logging, and orchestration layers.

To make things harder, many of the best-in-class data lake tools for ingestion, transformation, classification, and querying are only available inside these same hyperscaler platforms. By choosing those tools, you are implicitly accepting that the data must live within their ecosystem.

This directly impacts several architectural principles I am currently working on, including:

• Cloud services must never disrupt primary healthcare processes during outages
• Cloud platforms processing patient data must offer end-to-end encryption and explicit Zero Trust access controls
• Sensitive information must be periodically synchronized to local backup environments under the organization’s control
• Each cloud application must be explicitly categorized (SaaS, PaaS, IaaS) with a clear governance model
• Vendors must actively support audits on compliance, security, and data processing practices

These principles help structure your cloud approach and supplier expectations. They provide a basis for strong agreements with your providers. But even with well-crafted contracts and controls in place, there is no such thing as a 100 percent guarantee.
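
To make at least one of these principles tangible: a naive sketch of the periodic synchronization to a local backup environment, again using the Azure SDK. The container name and paths are placeholders, and a real job would sync incrementally and verify checksums rather than copy everything each run.

```python
# Sketch: a scheduled job that pulls sensitive data back into a backup
# environment the organisation itself controls.
from pathlib import Path
from azure.storage.blob import ContainerClient

def sync_to_local_backup(connection_string: str, container: str, backup_root: str) -> None:
    client = ContainerClient.from_connection_string(connection_string, container)
    for blob in client.list_blobs():
        target = Path(backup_root) / blob.name
        target.parent.mkdir(parents=True, exist_ok=True)
        # Naive full copy for illustration; not how you would run this at scale.
        target.write_bytes(client.download_blob(blob.name).readall())
```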

In theory, a sovereign cloud could help address these concerns. But only if it can offer the same level of service, tooling, and third-party integration as the current hyperscalers. That means building a comparable ecosystem: not just compliant storage, but also advanced analytics, AI pipelines, and orchestration tools with full local governance.

Realistically, that level of maturity may take years to materialize. Until then, the only real option is to think critically about how you use cloud services today, and whether your design aligns with the value and risk profile of your data.

Summary of Protection Mechanisms for Data Lake Architectures

• Use Customer Managed Keys (CMKs) to retain control over data encryption
• Recognize that shared infrastructure still grants the provider access to hardware and telemetry layers
• Accept that choosing native cloud tools implies vendor lock-in at the data level
• Define strict architectural principles, including:
  • No disruption to critical processes
  • End-to-end encryption and Zero Trust for sensitive data
  • Regular synchronization of sensitive data to local backup environments
  • Clear governance based on SaaS/PaaS/IaaS classification
  • Mandatory vendor cooperation in compliance and security audits
• Build supplier contracts around these principles, but acknowledge that 100% guarantees do not exist
• Evaluate if sovereign cloud can offer equivalent tooling and ecosystem maturity before assuming it is a safer option
• Make risk-based decisions today, regardless of future promises from sovereign offerings

On-demand scaling

Strictly speaking, this is not a standalone use case. It is a capability that supports nearly all of the previous examples, along with many others not mentioned here.

Whether you are training AI models, archiving regulatory data, or building real-time analytics pipelines, the ability to scale infrastructure without investing in physical hardware remains one of the cloud's most attractive features.

But just because it is convenient does not mean it is always the right choice. Scalability does not eliminate the need for proper data classification, governance, or architectural boundaries. If anything, it makes those things even more important. Because if you get it wrong, you are scaling your risk just as quickly as your compute.

Conclusion: Back to the Title

To me, the whole discussion around sovereign cloud feels like a promise shaped by fear. In some cases, that fear is justified. In many others, the risk can be mitigated through good design, governance, and awareness.

What worries me is how sovereign cloud is pitched as a solution to fear, rather than as a response to an actual data-driven risk model. Even more concerning is how the conversation seems to shift away from critical thinking.

We are being presented with platforms positioned as “safe by default,” which creates the illusion that hard choices no longer need to be made. That does not sit right.

A platform should never replace proper data classification, ownership, or risk-based decision-making. Fear may motivate action, but sovereignty should be earned, not assumed.

That’s why it’s worth repeating the core principles that apply across all use cases discussed:

• Start with data classification
• Understand the value and sensitivity of your data
• Apply technical safeguards like encryption, pseudonymisation, and access controls
• Avoid blind trust, even in platforms labeled "sovereign"
• Design for resilience and reversibility, especially in critical environments
• Define your risk model first, then choose your solution
• Remember: you remain responsible. Platforms can support, but not replace, ownership

One more critical note.

We often focus on the question: what if someone gains access to my data? But there is a second, more fearsome risk that is rarely discussed.

If someone has the power to access your data, they may also have the power to block your access to it. Imagine not being able to retrieve your compliance records during an audit. Imagine losing access to the AI model that still processes live production data.

That is not just a technical inconvenience. That is operational paralysis.

And this risk exists regardless of whether you choose a foreign or local platform. Once you depend on a third party, you inherently accept the risk that access can be revoked, delayed, or disrupted, whether due to legal action, policy changes, or even technical failure.

This, too, should be part of your risk model. Not just who can see your data, but who can stop you from seeing it.

Disclaimer

This piece is written to inform and hopefully inspire a different perspective. It is based on my personal thoughts and feelings about the topic. Some concepts may be oversimplified, and I do not claim to be a field expert.

Still, I hope it provides some food for thought, or at the very least, a reason to challenge your own assumptions.

For this article, I have intentionally approached the topic from a purely data-centric perspective. That said, I am personally in favor of sovereign cloud. Not only to stimulate competition in the market, but also as a strategic measure to protect against geopolitical risks and reduce dependency on dominant global tech providers.

There are valid reasons to pursue sovereignty. But even then, it starts with understanding your data and your risk model.

Hope you liked seeing a bit into my head.
Cheers,
Mark
