AI Agency Explored: NotebookLM's Insights

In the rapidly evolving landscape of artificial intelligence, we often find ourselves captivated by the seemingly intelligent responses of large language models (LLMs). But beneath the surface of these sophisticated systems lies a fundamental truth: AI lacks true agency.

The Illusion of Self

AI agency is a seductive concept—the belief that these complex systems possess a genuine sense of self or identity. However, this is nothing more than an elaborate illusion. LLMs are, at their core, intricate next-token prediction engines, masterfully disguised through fine-tuning and conversational interfaces.

The Mechanics Behind the Mask

After pre-training, these models are essentially prediction machines. Fine-tuning provides a veneer of instruction-following and conversational ability, but it's merely a surface-level adjustment. Reinforcement learning from human feedback (RLHF) further tunes the model toward responses that people rate highly, creating the impression of understanding and agency.
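To make "prediction machine" concrete, here is a deliberately tiny sketch of next-token prediction. It is a toy bigram counter, not how a real LLM is built: the corpus, the function names `train_bigram` and `generate`, and the greedy decoding choice are all my own illustrative assumptions. The only point is the loop: look at the last token, pick a likely continuation, repeat.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count, for each token, which tokens follow it in the training text."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, start, max_tokens=5):
    """Greedy decoding: repeatedly append the most frequent next token."""
    out = [start]
    for _ in range(max_tokens):
        followers = counts.get(out[-1])
        if not followers:
            break  # this token was never followed by anything in training
        out.append(followers.most_common(1)[0][0])
    return out


# Hypothetical toy corpus, chosen only for illustration.
corpus = "the host speaks then the host speaks then the guest".split()
model = train_bigram(corpus)
print(" ".join(generate(model, "the", max_tokens=4)))
```

Nothing in this loop knows who is "speaking". It only continues the sequence, which is the property the rest of this article leans on.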

A Practical Exploration: NotebookLM

To understand these limitations, I turned to NotebookLM, a popular tool for generating podcast scripts. This platform perfectly illustrates the challenges of creating multi-agent content when true agency is absent.

NotebookLM's Audio Overview feature transforms multiple sources into a discussion between two AI hosts who analyse the material, connect topics, and engage in casual banter. While technologically impressive, this automated podcast generation serves as an ideal case study for examining the boundaries of artificial agency.

The Challenge of Multiple Personas

When tasked with generating a script featuring two hosts, NotebookLM reveals the fundamental weaknesses in AI's ability to maintain distinct identities. The AI struggles with what psychologists call "theory of mind": the capacity to attribute mental states to different agents.

In my experience, I observed subtle yet telling inconsistencies: hosts would inadvertently adopt each other's personas, answer their own questions, or drift away from their intended voice.

Listen to the podcast example below and pay attention at 0:12, right at the start, and again a bit later at 0:51: Host-1 asks a question and then answers it herself, breaking the natural turn order twice.

https://open.spotify.com/episode/22L3ArrVFgeANRcXRwpwlS

https://codepen.io/gsans/pen/NWQQMzG

Where AI Falls Short

The primary limitation stems from treating the entire script as a single, undifferentiated unit. Without the ability to maintain separate, coherent internal states, the AI frequently produces scripts with:

Role confusion
Inconsistent persona maintenance
Unexpected voice switching
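The self-answered questions described earlier can be checked for mechanically once a script is split into speaker turns. The sketch below is a rough heuristic of my own, not something NotebookLM exposes: the `(speaker, text)` turn format, the function name `find_self_answers`, and the sample lines are all assumptions, and a rhetorical question would be flagged as a false positive.

```python
def find_self_answers(turns):
    """Return indices of turns where a speaker follows up their own question.

    `turns` is a list of (speaker, text) pairs in spoken order.
    """
    flagged = []
    for i in range(1, len(turns)):
        prev_speaker, prev_text = turns[i - 1]
        speaker, _ = turns[i]
        # A turn ending in "?" should normally be answered by someone else.
        if prev_text.rstrip().endswith("?") and speaker == prev_speaker:
            flagged.append(i)
    return flagged


# Hypothetical transcript fragment, invented for illustration.
sample_turns = [
    ("Host-1", "So what actually changed?"),
    ("Host-1", "Well, almost everything."),
    ("Host-2", "Right, and that matters."),
]
print(find_self_answers(sample_turns))
```

A check like this only catches one kind of role confusion, but it shows that the problem is observable from outside the model.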

A Path Forward: External Runtime Management

To address these challenges, I propose an innovative approach: an external runtime system dedicated to managing host states and interactions. This system would:

Maintain dynamic states for each host
Ensure consistent role adherence
Provide real-time interaction management
Allow scalable, complex conversation scenarios

How It Would Work

By introducing an API-driven external runtime, we can offload the complexity of state management from the AI itself. This system would track prior contributions, validate consistency, and dynamically adjust host interactions.
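As a rough sketch of what such a runtime could look like, the code below keeps one state object per host and refuses any turn that breaks a simple rule: a host may not answer their own question. Everything here is hypothetical, including the names `HostState`, `ConversationRuntime`, `allowed_speakers`, and `submit`, and the single validation rule stands in for whatever consistency checks a real system would need.

```python
from dataclasses import dataclass, field


@dataclass
class HostState:
    """Per-host state held outside the model (hypothetical structure)."""
    name: str
    persona: str
    contributions: list = field(default_factory=list)


class ConversationRuntime:
    """Tracks turns and decides which host may speak next."""

    def __init__(self, hosts):
        self.hosts = {h.name: h for h in hosts}
        self.history = []  # (speaker, text) pairs in spoken order

    def allowed_speakers(self):
        """After a question, everyone except the asker may speak."""
        names = set(self.hosts)
        if self.history:
            last_speaker, last_text = self.history[-1]
            if last_text.rstrip().endswith("?"):
                names.discard(last_speaker)
        return names

    def submit(self, speaker, text):
        """Record a turn if it is consistent; return False to reject it."""
        if speaker not in self.hosts:
            raise ValueError(f"unknown host: {speaker}")
        if speaker not in self.allowed_speakers():
            return False
        self.history.append((speaker, text))
        self.hosts[speaker].contributions.append(text)
        return True


runtime = ConversationRuntime([
    HostState("Host-1", "curious interviewer"),
    HostState("Host-2", "domain expert"),
])
accepted = [
    runtime.submit("Host-1", "So what is agency?"),
    runtime.submit("Host-1", "It is the capacity to act."),  # rejected
    runtime.submit("Host-2", "It is the capacity to act."),
]
print(accepted)
```

In this design the model only proposes lines. The runtime owns the record of who said what, so a rejected turn can be regenerated for the correct host instead of slipping into the script.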

Conclusion: Recognizing the Limits

While AI continues to advance at a remarkable pace, we must remain clear-eyed about its current limitations. The appearance of agency is just that—an appearance. By understanding these constraints, we can develop more nuanced, effective approaches to AI-generated content.

The journey of AI is not about creating artificial beings with true self-awareness, but about developing increasingly sophisticated tools that augment human creativity and communication.

