Colonialism in the Age of AI and Big Data


Our understanding of colonialism often centers on historical narratives of physical conquest and territorial expansion. However, as technology advances, a new, insidious form of colonialism has emerged: data colonialism. This isn't about physical borders, but about the control and exploitation of our digital lives. In my TEDxLanangAve talk, delivered on August 21, 2024, in Davao City, I explored this evolving landscape, discussing the ways our digital footprints are being harvested and the potential consequences.
The concept of "data colonialism," coined by Ulises Mejias and Nick Couldry, mirrors the four stages of historical colonialism laid out in their book "Data Grab": explore, expand, exploit, and exterminate. While the methods have changed, the underlying power dynamic remains eerily similar. Big tech companies, often operating under the guise of convenience and connection, amass vast quantities of personal data – our digital lives – and use it for profit, often without our full understanding or consent.
My TEDx talk highlighted several crucial concerns. The narratives used to justify data collection – that everyone wants an easier life, that it's the only way to connect, and that AI is superior – often overshadow the ethical complexities involved. The overhyping of AI capabilities and the potential for algorithmic bias, leading to discrimination in various areas like recruitment and law enforcement, were also addressed. The environmental cost of data centers and the exploitation of low-wage workers in developing countries labeling massive datasets are further troubling aspects.
While we can't simply abandon technology, we can actively resist data colonialism. My talk proposed several actionable steps: lowering unrealistic expectations of AI, challenging dominant narratives, taking regular social media detoxes, supporting open-source AI initiatives, and scrutinizing the actions of big tech companies. The future hinges on our ability to engage in thoughtful dialogue, develop creative solutions, and build a more equitable and sustainable digital world.
The full video and the full original transcript of my TEDx talk are available below.
Full Video
Original Speech Transcript
June 12, 1898: We were supposed to cease being a colony of a foreign country. But instead, a new emerging empire replaced an old crumbling one.
July 4, 1946: The US finally recognized our independence.
Did colonialism end that day? Maybe. But scholars like UP Professor Eduardo Tadem argue that resources are still being siphoned from provincial areas like Mindanao for the benefit of Metro Manila, in what’s called “Internal Colonialism”. As the new world order evolved, so did colonialism.
And it's still evolving. With breakthroughs in web, mobile, IoT, and AI, a new form of colonialism has emerged. In the book “The Costs of Connection”, authors Ulises Mejias and Nick Couldry called this “Data Colonialism”.
To understand this, we need to use the Four X Colonial Formula they introduced in their other book “Data Grab”. These are Explore, Expand, Exploit, Exterminate.
Historical Colonialism explored new land to grab; built and expanded colonies on these conquered lands; exploited their resources to enrich their empires; and exterminated using economic monopoly, oppression, and mass murder.
Meanwhile, Data Colonialism explores virtual human lives to control; expands by building digital colonies around them; exploits through large scale data extraction from these colonies, turning them into profits for Big Tech; and exterminates through monopolistic and anti-competitive behavior, and destruction of social values and culture. Well, at least data colonialism doesn’t need to resort to violence anymore, since we’re already providing our consent by agreeing to Terms & Conditions that NO ONE READS.
When I hear of colonies, I think of walled cities like Intramuros in Manila and Fort San Pedro in Cebu. Or vast haciendas of rice, sugar, and tobacco tilled by underpaid farmers. By contrast, data colonies are more personal and always with us, unbounded by land borders. They range from smartphones to laptops to smart home devices, and the software that runs on them.
But the idea remains the same: a data colony is designed to efficiently extract data from its digital territory to build wealth and power for an elite few, who claim the data is just “there and free for the taking”.
Weirdly, colonizers have a need to justify their actions. While historical colonialism claimed it had the legal, religious, and scientific obligation to subjugate, exploit, and oppress, data colonialism has a set of three narratives, as observed by Mejias and Couldry:
First, everyone wants an easier life. Second, this is how we connect. Lastly, AI is smarter than humans.
It is in our best interest to not accept these narratives at face value, no matter how true they may seem.
Nvidia CEO Jensen Huang said that children should not learn how to code anymore. It’s a dangerous statement, as I believe coding and computer science will always be relevant. Note that Tech CEOs like him have the most to gain from the “success” of AI. After all, Nvidia is the world’s leading manufacturer of AI chips.
Sam Altman, CEO of the company behind ChatGPT, was quoted in the book “Our AI Journey”, saying that AI will replace 95% of marketing and creative work. We must take that with a grain of salt, since CEOs need to sell their company’s value so that more investments and consumer interest will flow in, or in other words “marketing”.
Worse, some companies overhype and even fake presentations of their AI products. Google received backlash for its staged Gemini demo in December 2023. Then in March 2024, startup Cognition Labs unveiled Devin AI, claiming it was the world’s first AI software engineer. This caused fear among software developers, only for the company to admit that Devin is “new and far from perfect”, and that the demo video was just a “technical preview”. A month later, Crunchbase reported that Cognition raised $175M in funding at a whopping $2B valuation.
I’ve always thought that as long as I benefit from apps and devices, why should I care about my data? Turns out, I really should.
Social media algorithms cause filter bubbles, the digital equivalent of echo chambers. This was evident in the few recent Philippine elections, and the spread of disinformation. And need I remind you of Cambridge Analytica?
The book “Naked Statistics” emphasized how context matters almost as much as the data itself. But AI poses the risk of extracting patterns from out-of-context knowledge. This is bad because the world from which its data is collected is full of bias and prejudice. Without context, especially if left unchecked, it may lead to dire consequences like algorithmic discrimination.
In 2018, for instance, Amazon abandoned a secret AI recruiting tool because of its bias against women. In 2020, a man was arrested in Detroit because he was wrongfully identified by facial recognition software as a robbery suspect. Let’s not even go into China’s plan to export its AI surveillance state.
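To see mechanically how this failure mode arises, here is a deliberately oversimplified, hypothetical sketch (the hiring records below are invented purely for illustration, not taken from any real system): a naive "model" that only learns historical base rates will faithfully reproduce whatever prejudice is baked into its training data.

```python
from collections import Counter, defaultdict

# Invented, deliberately skewed historical hiring records.
history = [
    ("group_a", "hired"), ("group_a", "hired"), ("group_a", "hired"),
    ("group_a", "rejected"),
    ("group_b", "hired"),
    ("group_b", "rejected"), ("group_b", "rejected"), ("group_b", "rejected"),
]

# Tally outcomes per group: this is all the "model" ever learns.
outcomes = defaultdict(Counter)
for group, outcome in history:
    outcomes[group][outcome] += 1

def predict(group):
    """Predict the historically most common outcome for a group."""
    return outcomes[group].most_common(1)[0][0]

print(predict("group_a"))  # hired
print(predict("group_b"))  # rejected: the bias in the data became the rule
```

Real recruiting systems are vastly more complex, but the core danger is the same: without context, patterns extracted from biased data harden into predictions.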
If those aren't alarming, how about AI’s effect on the environment?
Goldman Sachs analysts found that one ChatGPT query consumes as much energy as running a 5-watt LED light for an hour. According to Boston Consulting Group, data center demand in the US alone is expected to grow to 130GW by 2030. That’s nearly eight times the peak demand of the whole Philippines in 2023, which NGCP recorded at 17GW.
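A quick back-of-the-envelope check of those figures (the numbers are the ones quoted above; the conversions are mine):

```python
# One ChatGPT query vs. an hour of a 5-watt LED light (the Goldman Sachs comparison).
led_watts = 5
hours = 1
query_energy_wh = led_watts * hours  # about 5 watt-hours per query

# Projected US data center demand vs. the Philippines' 2023 peak demand.
us_datacenter_gw_2030 = 130  # Boston Consulting Group projection
ph_peak_gw_2023 = 17         # NGCP-recorded peak
ratio = us_datacenter_gw_2030 / ph_peak_gw_2023

print(f"~{query_energy_wh} Wh per query; US data centers ~{ratio:.1f}x PH peak demand")
```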
In 2019, University of Massachusetts researchers found that training a single AI model can emit around five times the lifetime carbon emissions of a car. Since that study, large language models have grown to insane scales, giving birth to chatbots like ChatGPT and Gemini.
Additionally, a study by researchers from the University of California, Riverside and the University of Texas at Arlington estimates that AI will withdraw 6.6 billion cubic meters of water yearly by 2027 just to cool the data centers where AI is trained. Training ChatGPT alone consumed around 700 thousand liters of water, and every 10–50 ChatGPT prompts consume the equivalent of a 16-ounce bottle of water.
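Those per-prompt water figures can be sanity-checked with simple unit conversions (assuming a 16-US-fluid-ounce bottle, roughly 0.47 liters):

```python
OZ_TO_LITERS = 0.0296  # one US fluid ounce in liters

bottle_liters = 16 * OZ_TO_LITERS               # a 16 oz bottle: ~0.47 L
per_prompt_ml_low = bottle_liters / 50 * 1000   # if 50 prompts share one bottle
per_prompt_ml_high = bottle_liters / 10 * 1000  # if only 10 prompts share it

training_liters = 700_000      # water used to train ChatGPT, per the study
yearly_m3_2027 = 6.6e9         # projected yearly withdrawal by 2027
yearly_liters_2027 = yearly_m3_2027 * 1000  # 1 cubic meter = 1,000 liters

print(f"~{per_prompt_ml_low:.0f} to {per_prompt_ml_high:.0f} mL of water per prompt")
print(f"Projected 2027 withdrawal = {yearly_liters_2027 / training_liters:,.0f} ChatGPT trainings")
```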
Meanwhile, workers from emerging economies are being paid as little as 20 pesos an hour to label huge amounts of data for services like Amazon’s Mechanical Turk. AI ethicist Dominic Ligot calls these “digital sweatshops”. Interesting choice of name, by the way, since the original Mechanical Turk was an 18th-century chess-playing “robot” that was eventually exposed as a hoax.
University of Sheffield’s Clive Humby famously said that “Data is the New Oil.” Why, then, are we giving it away for free? And how can we resist Data Colonialism? Should we quit social media or throw our phones away? Of course not.
This is a complex problem. No one has all the answers. We must rely on our ability to come up with creative and novel solutions. But these won’t come out unless we start having these conversations. That said, let me suggest some ways to fight back.
First, we must tone down our expectations of AI. We fear what we don’t understand. If you can, try to understand the algorithms behind them. Allow me to offer my oversimplification: AI is a superpowered statistical prediction tool.
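To make that oversimplification concrete, here is a minimal illustration (my own toy example, not from the talk): a next-word predictor built from nothing but word-pair counts. Large language models are, at their statistical core, this idea scaled up by many orders of magnitude.

```python
from collections import Counter, defaultdict

# A tiny "training corpus"; real models ingest trillions of words.
corpus = "data is the new oil and data is power and data is everywhere".split()

# Count how often each word follows each other word (bigram statistics).
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def predict_next(word):
    """Return the statistically most likely next word, or None if unseen."""
    counts = following.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("data"))  # "is": it followed "data" every time in the corpus
```

A chatbot repeatedly samples from learned probabilities like these; the "intelligence" we perceive is prediction at enormous scale.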
Second, we must realize that the West and Big Tech don’t have a monopoly on ideas on how we should live our lives. We all have a stake in forming the narratives.
Third, occasionally go on a social media detox.
Fourth, support open source data and AI initiatives.
Fifth, let’s make a habit of scrutinizing Big Tech. Between Facebook’s obsession with the Metaverse and smart glasses, Google killing its plan to remove tracking cookies from Chrome, Microsoft’s AI-powered search tool tracking user activity on Windows, including passwords, and Elon Musk’s AI model training on public tweets... Do we really buy the narrative that these are all for our own good? Or are these new data colonies they’re forcing on us?
Lastly, let’s demand better cybersecurity and privacy policies.
Despite all I've said, I still have high hopes for responsible and sustainable AI. There is a lot of ongoing research on the use of AI for detecting cancer, battling wildfires, preventing deforestation, and saving endangered animals. There’s plenty of good that AI can bring into the world, but generating creepy AI art is not it. Destroying our environment is not it. Widening global inequality and inequity is not it.
Ah, before I end, I just want to say that… No, this talk is not AI-generated.
Thank you, TEDx. And daghang salamat, Davao!
Written by

Kuya Dev (Rem Lampa)
Podcaster @ Kuya Dev Podcast | Backend Tech Lead @ Prosple | Founder @ Tech Career Shifter Philippines | Community Leader @ freeCodeCamp.Manila | Board Director @ ReactJS Philippines