When Chatbot Training Data Goes Wrong, Millions Pay the Price

Gerard Sans

We live in an era of digital miracles. Ask ChatGPT to write you a poem, and it obliges with eloquent verses. Query Claude about quantum physics, and it responds with sophisticated explanations. The technology feels seamless—until you peek behind the curtain and see the machinery that makes it possible.

To understand what's at stake, you need to know how these systems actually work. Companies like OpenAI (ChatGPT) and Google (Gemini) build the models themselves, but they don't create all of the human-authored training data in-house. Instead, they outsource much of that critical work to specialized data companies like Scale AI, which operates through subsidiaries such as Outlier.

Here’s how the process works: Scale AI hires thousands of freelance trainers—experts in various fields—to create high-quality examples of human reasoning, conversation, and problem-solving. These examples become the “training data” that teaches language models how to generate appropriate responses. Think of it like creating comprehensive reference materials that algorithms use to pattern-match and produce outputs.
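
To make this concrete, here is a minimal, hypothetical sketch of what a single human-authored training example might look like once it is packaged into a supervised fine-tuning dataset. The field names and content below are illustrative assumptions, not any vendor's actual schema.

```python
# Hypothetical shape of one human-written training example for
# supervised fine-tuning. Field names are illustrative assumptions,
# not any vendor's actual schema.
training_example = {
    "prompt": "Explain why the sky appears blue, for a ten-year-old.",
    "response": (
        "Sunlight contains every colour. As it travels through the air, "
        "tiny molecules scatter blue light more than the other colours, "
        "so blue light reaches your eyes from every part of the sky."
    ),
    "metadata": {
        "domain": "physics",            # subject area the trainer covered
        "trainer_expertise": "expert",  # credential level of the human author
        "reviewed": True,               # whether a second human checked the work
    },
}

# A model fine-tuned on many such prompt-and-response pairs learns to
# imitate the mapping from questions to high-quality human answers.
```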

The quality of these human-created examples directly determines how accurate and useful the system’s outputs will be. If the training data is excellent, the model generates thoughtful, accurate responses. If the training data is corrupted or low-quality, the system produces unreliable outputs—confidently and at scale.

Recent whistleblower accounts from within the AI training industry reveal a troubling reality: the foundation on which these systems rest is being eroded by the very companies building them. Daniel Flint, a former AI trainer who worked for Scale AI's subsidiary Outlier, has taken to TikTok (dan_da_dani) to expose what he describes as systematic failures in the AI training pipeline. His viral posts detail alleged experiences creating datasets for major language models, depicting an industry where quality controls have broken down and human expertise is being systematically devalued.

Flint's testimony offers a rare window into the hidden world of AI training, a multi-billion-dollar industry that operates largely out of public view despite its outsized influence on technology millions use daily. What emerges is a picture of an industry so focused on scaling quickly and cutting costs that it is poisoning its own well.

The Human Cost of Artificial Intelligence

The story begins with the people we rarely see—the AI trainers who create the datasets that teach our digital oracles how to think. These aren’t just data entry workers; they’re the skilled professionals who provide the examples of human reasoning, empathy, and wisdom that AI systems learn to mimic.

But according to insider accounts, the industry is treating this crucial workforce as expendable:

  • Massive pay cuts: Quality manager rates reportedly slashed from $35 to $15 per hour

  • Broken automation: Human reviewers replaced by systems that allegedly fail qualified trainers while approving incorrect work

  • Compromised sources: Trainers instructed to use Wikipedia and social media posts as authoritative references for major AI models

If true, this last point represents a breathtaking abdication of quality control—like asking chefs to prepare gourmet meals using only gas station ingredients.

The Philosophical Crisis Behind the Headlines

These operational failures point to a deeper conceptual problem. Philosopher John Vervaeke offers a framework that helps explain why the current approach is fundamentally limited. Human knowledge, he argues, exists in multiple dimensions:

  • Propositional knowing (“knowing that”): Facts, statements, information

  • Procedural knowing (“knowing how”): Skills, techniques, embodied expertise

  • Perspectival knowing (“knowing what it’s like”): Experience, context, empathy

  • Participatory knowing (“knowing by being”): Wisdom gained through engagement with the world

Current AI systems excel at the first type—they’re remarkable pattern-matching machines that can process vast amounts of propositional knowledge. But they cannot access the other three dimensions, which are arguably where human wisdom truly resides.

The cruel irony is that the industry depends entirely on human trainers to capture echoes of these deeper forms of knowing in text form. When a language model appears to show empathy or understanding, it’s because a human trainer once demonstrated those qualities in the datasets. We’re essentially trying to replicate the outputs of human cognition by recording and processing fragments of human-generated examples.

Yet the economic incentives are driving companies to devalue and discard the very humans who provide this irreplaceable input.

The Coming Reckoning

This isn’t sustainable, and the cracks are already showing. AI systems regularly generate false information, struggle with basic logical reasoning, and fail to grasp context that would be obvious to any human. These aren’t bugs to be patched—they’re symptoms of a fundamentally flawed approach to training these systems.

The industry is experiencing what technologists call “garbage in, garbage out,” but on a massive scale. When you systematically degrade your training pipeline through:

  • Replacing skilled human judgment with broken automation

  • Instructing trainers to use unreliable sources

  • Cutting costs at every level of quality control

the output will inevitably suffer. More concerningly, this creates a vicious cycle. As AI-generated content floods the internet, future models will increasingly train on synthetic rather than human-generated text. Each generation risks becoming further divorced from authentic human reasoning and wisdom.
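
As a toy illustration of that feedback loop (an assumption-laden sketch, not a claim about any production pipeline), the snippet below repeatedly fits a simple statistical model to samples drawn from its own previous fit. With no fresh human data entering the loop, estimation error compounds from one generation to the next, the basic mechanism behind what researchers call "model collapse".

```python
import numpy as np

# Toy sketch of recursive training on synthetic data: each "generation"
# fits a Gaussian to a finite sample drawn from the previous generation's
# fitted Gaussian. All numbers here are arbitrary assumptions.
rng = np.random.default_rng(0)

mu, sigma = 0.0, 1.0      # generation 0: the original "human" distribution
n_samples = 200           # finite dataset available to each generation

for generation in range(1, 11):
    samples = rng.normal(mu, sigma, n_samples)  # data produced by the previous model
    mu, sigma = samples.mean(), samples.std()   # next model fits only that synthetic data
    print(f"gen {generation:2d}: mean={mu:+.3f}, std={sigma:.3f}")

# Because no new human data ever enters the loop, sampling error accumulates:
# the fitted distribution drifts and its spread tends to shrink over generations.
```

Real training pipelines are vastly more complex, but the direction of the effect is the point: recycled model outputs cannot replace the signal that only fresh human examples provide.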

A Path Forward

This isn’t an argument against AI—it’s a plea to save this transformative technology from its own worst impulses. The current trajectory is unsustainable, but it’s not irreversible.

First, the industry must recognize that data quality is not a cost center to be optimized away—it’s the foundation everything else rests on. The humans who create training datasets need to be:

  • Valued as skilled professionals

  • Compensated fairly for their expertise

  • Empowered with robust quality control systems

You cannot build trustworthy AI on the backs of exploited workers using broken processes.

Second, we need honesty about current limitations. The transformer architecture is remarkable for what it does, but it’s not artificial general intelligence. It’s a sophisticated prediction engine that excels at pattern matching but struggles with genuine logical reasoning, causal understanding, and contextual comprehension. Acknowledging these limits is the first step toward transcending them.
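
To see what "prediction engine" means in the simplest possible terms, here is a deliberately crude sketch: a bigram counter rather than a transformer, but it captures the core move of predicting the next token from patterns observed in training text. The corpus and helper function are purely illustrative.

```python
from collections import Counter, defaultdict

# Crude stand-in for "prediction by pattern matching": count which token
# follows which in a tiny corpus, then predict the most common follower.
corpus = "the cat sat on the mat the cat ate".split()

next_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    next_counts[current][nxt] += 1  # record the observed pattern

def predict_next(token: str) -> str:
    # Return the most frequently observed follower of `token`.
    return next_counts[token].most_common(1)[0][0]

print(predict_next("the"))  # -> "cat", because that pattern dominates the corpus
```

Scaled up by many orders of magnitude and conditioned on far richer context, a language model is still performing statistical imitation of its training data rather than reasoning from first principles, which is exactly why the quality of that data matters so much.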

Finally, the industry needs to invest seriously in new approaches that might bridge the gap between pattern matching and genuine intelligence. This means supporting research into architectures that can handle abstraction, causal reasoning, and the integration of multiple forms of knowledge.

The Stakes

The AI industry stands at a crossroads. Down one path lies continued hypergrowth built on increasingly shaky foundations—impressive demos that mask fundamental flaws, systems that seem intelligent but regularly fail in unpredictable ways. Down the other lies slower but more sustainable progress toward genuinely robust and trustworthy AI.

The choice isn’t just about technology—it’s about the kind of future we’re building. Do we want:

  • AI systems that are flashy but unreliable, built on the exploitation of human expertise?

Or…

  • Technology that genuinely augments human capability, created through processes that honor both the complexity of intelligence and the dignity of the people who make it possible?

The voices from within the industry are trying to warn us that the current approach is unsustainable. The question is whether we’ll listen before the foundation crumbles entirely.

We acknowledge that specific claims about individual companies’ practices described in this piece remain allegations and have not been independently verified.

