The Hidden Game: How AI Giants Use Free Offerings to Harvest Your Digital Soul

Aroop Rath

The Great AI Data Mining Deception

In July 2025, a seemingly generous announcement sent ripples through India's tech landscape. Bharti Airtel, the country's second-largest telecom giant, partnered with Perplexity AI to offer 360 million users a free 12-month subscription to Perplexity Pro, normally valued at ₹17,000 per year. Almost simultaneously, Google unveiled its own "gift" to students worldwide: free access to Gemini Advanced AI, complete with premium features worth thousands of dollars annually.

These announcements were celebrated as democratizing AI access, empowering millions with cutting-edge technology. But behind the glossy press releases and feel-good narratives lies a far more calculated strategy: the systematic harvesting of human behavior data to strengthen AI models at an unprecedented scale.

This isn't just about free services; it's about data colonization in the digital age.

The Perplexity-Airtel Alliance: A Trojan Horse Strategy

When Perplexity CEO Aravind Srinivas announced the Airtel partnership, he framed it as bringing "cutting-edge AI in the hands of millions". The reality tells a different story. Within hours of the announcement, Perplexity shot to the top of India's Apple App Store, overtaking even ChatGPT. This wasn't just success – it was a user acquisition masterclass disguised as corporate generosity.

The offer extends to Airtel's 360 million customers across mobile, broadband, and DTH services. To put this in perspective, that's roughly one-quarter of India's entire population suddenly having access to an AI platform that records every query, interaction, and behavioral pattern.
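The "one-quarter" comparison, and the headline value of the giveaway, can be sanity-checked with quick arithmetic. The sketch below uses the article's own figures (360 million customers, ₹17,000 per year) plus one outside assumption that is not from the article: a population of roughly 1.43 billion for India. It also treats every customer account as one person, which overstates reach where people hold several Airtel connections.

```python
# Figures quoted in the article
AIRTEL_CUSTOMERS = 360_000_000
PRO_PRICE_INR_PER_YEAR = 17_000

# Assumption (not from the article): India's population, roughly 1.43 billion
INDIA_POPULATION = 1_430_000_000


def population_share(users: int, population: int) -> float:
    """Fraction of the population covered if each account is one person."""
    return users / population


def nominal_offer_value(users: int, price_per_user: int) -> int:
    """List-price value of the offer if every eligible user redeemed it."""
    return users * price_per_user


share = population_share(AIRTEL_CUSTOMERS, INDIA_POPULATION)
value_inr = nominal_offer_value(AIRTEL_CUSTOMERS, PRO_PRICE_INR_PER_YEAR)

print(f"Share of population: {share:.1%}")
print(f"Nominal list-price value: ₹{value_inr / 1e12:.2f} trillion")
```

Under these assumptions the share comes out at about 25%, and the list-price value at about ₹6.12 trillion. The second number is a ceiling, not a cost: it assumes full redemption at full price, which no one expects.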

The true cost? Each user becomes a data generator, feeding Perplexity's AI models with:

  • Search patterns and preferences

  • Language nuances and regional variations

  • Problem-solving approaches and thought processes

  • Professional and personal information needs

  • Time-based usage patterns and habits

As one industry observer noted, "This partnership is an exciting way to make accurate, trustworthy, and professional grade AI accessible to more people in India... with Perplexity Pro, users get a smarter, easier way to find information". But they also become unwitting training data contributors for one of the world's most sophisticated AI systems.

The Data Collection Reality

Behind Perplexity's sleek interface lies a comprehensive data collection apparatus. According to recent analysis, Perplexity AI collects 36% of user data input into the platform. While this might seem modest compared to Google Assistant's 86% or Amazon Alexa's 93%, the quality and depth of data collected through conversational AI interactions are exponentially more valuable than simple voice commands.

Every question asked, every follow-up query, every document uploaded through the Pro features becomes training fuel for Perplexity's models. The platform's Deep Research feature, available free to Airtel users, encourages users to engage in complex, multi-step research processes, generating rich behavioral datasets that would cost millions to acquire through traditional means.

Google's Student Strategy: The Long Game

Google's approach with Gemini for Students reveals an even more sophisticated data harvesting strategy. By targeting students aged 18 and above, Google isn't just acquiring users; it's shaping the AI-native generation.

The Educational Trojan Horse

The student offering includes:

  • Free access to Gemini 2.5 Pro (Google's most advanced model)

  • NotebookLM Plus with enhanced capabilities

  • 2TB of cloud storage across Google services

  • Integration with Google Workspace for seamless productivity

But here's the hidden mechanism: every essay written, every research query, every creative project becomes training data for Google's AI systems. Students receive "free" AI assistance while unknowingly contributing to the largest behavioral data collection experiment in educational history.

The Scale of Student Data Mining

In India alone, Google's student program targets millions of college students, with the offer open until September 15, 2025. A Google-Kantar study revealed that 95% of Indian students using Gemini feel more confident in their daily lives, while 75% of Indians seek AI collaboration tools for personal growth.

This isn't just user satisfaction; it's dependency creation. Students become accustomed to AI assistance for:

  • Academic research and writing

  • Career preparation and interviews

  • Creative projects and presentations

  • Problem-solving and critical thinking

Each interaction trains future AI models while creating a generation psychologically dependent on Google's AI ecosystem.

The Data Mining Industrial Complex

How AI Models Learn from Human Behavior

Modern AI systems require massive datasets to achieve human-like performance. According to IBM's description of training data requirements, AI models need extensive and diverse datasets with thousands to millions of examples to learn patterns effectively. But not all data is created equal.

Conversational AI interactions provide uniquely valuable training data because they capture:

  • Natural language patterns in real-world contexts

  • Problem-solving methodologies across different domains

  • Cultural and linguistic nuances specific to regions

  • Behavioral preferences and decision-making processes

  • Temporal patterns of human information needs

The Economics of "Free"

The global data mining tools market was valued at $591.2 million in 2018 and is projected to reach $1.21 billion by 2025. But these figures pale compared to the value extracted from behavioral AI training data.
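For a sense of how fast that market is projected to grow, the two figures above imply a compound annual growth rate (CAGR). The sketch below is only a back-of-the-envelope check on the article's numbers; the function name `cagr` is illustrative, and the calculation assumes smooth compounding over the seven years from 2018 to 2025.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article, in millions of US dollars
MARKET_2018_USD_M = 591.2
MARKET_2025_USD_M = 1210.0
YEARS = 2025 - 2018

growth = cagr(MARKET_2018_USD_M, MARKET_2025_USD_M, YEARS)
print(f"Implied CAGR: {growth:.1%}")
```

The result is just under 11% a year: a healthy growth rate, but for a market that stays in the low billions.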

Recent data licensing deals reveal the true worth:

  • Reddit's data licensing agreements: $203 million in aggregate contract value, as disclosed in its IPO filing

  • Apple-Shutterstock deal: $25-50 million for image training data

  • Google-Stack Overflow partnership: Undisclosed millions for programming content

The Perplexity-Airtel and Google-Students deals dwarf these figures in potential value, providing access to hundreds of millions of users generating real-time behavioral data across diverse demographics and use cases.

The Freemium Data Trap

The freemium model has become the perfect vehicle for mass data collection. Research shows that companies with freemium models see up to 50% lower customer acquisition costs, but the real savings come from data acquisition efficiency.

Traditional data collection methods include:

  • Web scraping and tracking

  • Crowdsourcing and surveys

  • Sensor data and IoT devices

  • Social media monitoring

  • Purchase behavior analysis

But freemium AI services provide something far more valuable: voluntary, detailed, and continuous behavioral data from users who believe they're receiving value in exchange.

The Hidden Mechanisms of Data Exploitation

Beyond Simple Tracking

Modern AI platforms employ sophisticated data collection mechanisms that go far beyond traditional tracking:

Behavioral Profiling: AI systems analyze interaction patterns, response times, query complexity, and follow-up behaviors to build comprehensive user profiles.

Sentiment Analysis: Every conversation is analyzed for emotional states, satisfaction levels, and psychological patterns using natural language processing.

Predictive Modeling: User data trains models to predict future behaviors, preferences, and needs – information invaluable for product development and monetization.

Cross-Platform Integration: Free AI services integrate with existing ecosystems (Google Workspace, Microsoft Office, social platforms) to create comprehensive behavioral maps across users' digital lives.

The Win-Win Facade

These partnerships are presented as mutually beneficial:

  • Users receive valuable AI services worth thousands of dollars

  • Telecom/Education providers offer enhanced value propositions

  • AI companies gain market penetration and user feedback

  • Society benefits from increased AI literacy and access

But this narrative obscures the fundamental asymmetry: users provide irreplaceable behavioral data worth far more than the AI services they receive in return.

The Dark Side of Data Dependency

Creating Digital Dependence

The true goal isn't just data collection – it's creating permanent dependence on AI systems. By offering powerful capabilities for free, companies ensure users integrate AI into their cognitive processes, decision-making, and daily workflows.

Students who rely on Gemini for research and writing become dependent on Google's AI ecosystem for intellectual tasks. Airtel users who use Perplexity for information discovery develop behavioral patterns centered around the platform's capabilities.

This creates switching costs far beyond simple subscription fees – users must retrain their cognitive habits and workflows to use alternative platforms.

Data Colonialism

The Perplexity-Airtel deal represents a form of digital colonialism where American AI companies extract behavioral resources from developing markets while concentrating AI capabilities and control in Silicon Valley. India provides 360 million data subjects while Perplexity maintains algorithmic sovereignty and model ownership.

This echoes historical resource extraction patterns where colonies provided raw materials while colonial powers retained manufacturing capabilities and value creation. In the AI age, human behavioral data has become the raw material, and algorithmic intelligence represents the manufactured product.

The Path Forward: Recognizing the Real Cost

Understanding the Exchange

The Perplexity-Airtel and Google-Students partnerships aren't acts of corporate generosity; they're sophisticated data acquisition strategies disguised as social good initiatives. Users aren't receiving free services; they're paying with their behavioral data, cognitive patterns, and future autonomy.

The true price includes:

  • Loss of cognitive privacy through comprehensive behavior monitoring

  • Psychological dependence on AI systems for intellectual tasks

  • Behavioral modification toward more predictable and monetizable patterns

  • Contribution to AI systems that may compete with human capabilities

  • Surrender of algorithmic sovereignty to foreign corporate entities

Building Awareness

Digital literacy must evolve beyond understanding privacy settings and data policies to recognizing behavioral manipulation and cognitive influence techniques. Users need frameworks for evaluating the true costs of "free" AI services and making informed decisions about cognitive autonomy and data sovereignty.

Educational institutions, policymakers, and civil society organizations must develop ethical frameworks for AI data collection that prioritize human agency, cognitive liberty, and democratic values over corporate profits and technological convenience.

Conclusion: The Choice Between Convenience and Autonomy

The AI revolution is fundamentally reshaping human-computer interaction and cognitive augmentation. But the current trajectory, dominated by surveillance capitalism and behavioral extraction, threatens to subordinate human intelligence to algorithmic control.

The Perplexity-Airtel and Google-Students partnerships represent pivotal moments in this transformation. They offer a glimpse into a future where AI assistance comes at the cost of cognitive autonomy, where technological convenience requires behavioral surveillance, and where human intelligence becomes raw material for corporate AI development.

The choice is still ours, but only if we recognize it. We can embrace AI augmentation while demanding cognitive privacy, algorithmic transparency, and human agency. We can benefit from AI capabilities while maintaining intellectual independence and behavioral sovereignty.

But first, we must see through the generous facades and democratization narratives to understand the true nature of the exchange. In the age of AI, there are no free lunches, only different ways of paying the bill.

The question isn't whether we'll use AI; it's whether AI will use us. And that answer depends on whether we recognize the game while we still have the power to change the rules.

Written by

Aroop Rath

Level 30 gamer, part-time code wrangler, and certified gym regular. Not your go-to for small talk, but I'll beat you at Mario and maybe outlift you on leg day. Approach with snacks or patch notes for best results.