Decoding AI Jargons with Neo 🕶️

Table of contents
- Tokenization — “The Matrix is made of code. Tokenization is how I read it.”
- Encoder & Decoder — “Two sides of the same coin.”
- Vocab Size — “The more words you know, the more control you have.”
- Vector Embedding — “I don’t dodge words. I feel their direction.”
- Positional Encoding — "Order in Chaos"
- Transformers — “Like Neo in full awareness mode.”
- Self-Attention — "Focus is everything."
- Multi-head Attention — “Why choose one focus, when you can have many?”
- SoftMax — “Choice. It’s the illusion of control.”
- Temperature — “How wild do you want the Matrix to get?”
- Knowledge Cutoff — “Time stops—for the machines.”
- Closing: — “But it’s not!”

Tokenization — “The Matrix is made of code. Tokenization is how I read it.”
🗣️ Neo:
"When I first saw the code, it looked like chaos. But then I understood—it’s all made of tokens. Just like breaking the Matrix into readable parts, tokenization breaks language into smaller units—words, sub words, or even characters.It’s how machines start to 'read' us. Tokenization is the red pill for data—making the abstract, visible."
It’s how AI breaks down text into smaller pieces — like words or parts of words. This helps the model understand and process language more effectively.
```python
import tiktoken

encoder = tiktoken.encoding_for_model('gpt-4o')

text = "I'm Neo"
tokens = encoder.encode(text)
print("Neo in AI World is", tokens)  # Neo in AI World is [15390, 69594]
```
This is what tokenization looks like in basic terms; you can play with it in the playground as well.
Encoder & Decoder — “Two sides of the same coin.”
🗣️ Neo:
"Encoders absorb the truth. Decoders rebuild it. Like Morpheus and I—we gather knowledge, then we act."
- Encoder
The part of the model that reads and understands the input text. It turns words into numbers that capture their meaning.
text = "I'm Neo"
tokens = encoder.encode(text)
print(tokens) #[15390, 69594]
- Decoder
The part that takes those numbers and turns them back into words.
```python
neo_tokens = [15390, 69594]
decoded = encoder.decode(neo_tokens)
print(decoded)  # I'm Neo
```
Vocab Size — “The more words you know, the more control you have.”
🗣️ Neo:
“Vocab size is the breadth of your universe. Bigger vocab, better nuance. But remember—understanding doesn’t come from size. It comes from meaning.”
When Neo downloads kung fu into his brain and opens his eyes wide—"I know kung fu"—that’s like expanding a model’s vocabulary. Suddenly, he can express and understand complex combat moves he never knew before.
In AI, vocab size is how many tokens a model knows—words, subwords, symbols. Bigger vocab = more precise language.
Some models know 32K tokens, others 100K+, which affects how well they handle slang, code, or poetry.
But just like Neo—it’s not about how much you know, it’s how you use it. 🥋
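To make the idea concrete, here is a toy sketch (not a real tokenizer): it builds a tiny "vocabulary" from a made-up corpus and shows how vocab size limits what can be represented. Real tokenizers like tiktoken expose their vocab size too (the `Encoding` object has an `n_vocab` attribute), and they fall back to subword pieces rather than failing on unknown words.

```python
# Toy illustration: build a word-level "vocabulary" from a tiny corpus.
corpus = "the matrix has you follow the white rabbit"
vocab = {word: idx for idx, word in enumerate(sorted(set(corpus.split())))}

print("Vocab size:", len(vocab))  # 7 unique words
print("Vocab:", vocab)

# A word outside the vocabulary has no ID. Real tokenizers break such a
# word into subword pieces (or map it to an <unk> token) instead.
print("'kungfu' in vocab?", "kungfu" in vocab)  # False
```

A bigger corpus yields a bigger vocab, and a bigger vocab means fewer words have to be split into fragments—which is why vocab size affects how well a model handles slang, code, or poetry.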
Vector Embedding — “I don’t dodge words. I feel their direction.”
🗣️ Neo:
“In the Matrix, everything moves with purpose—bullets, agents, thoughts.
Words are no different. Vector embeddings give them direction and position in space.
It’s not about what a word is—it’s about where it is and how it moves among others.
That’s how machines understand. Not by reading… but by feeling the flow.”
A vector embedding is how a machine represents meaning with numbers. It turns words, sentences, or images into a list of numbers (a vector) so AI can understand and compare them—like coordinates in the Matrix.
You can visualize embeddings in the TensorFlow Embedding Projector.
Example:
"Neo" → [0.98, -0.33, 0.22, ...]
"Trinity" → [0.96, -0.34, 0.25, ...]
They’re close in numbers → because they share a deep bond.
Think of it like this:
🗣️ Human:
"Neo and Trinity are partners."
🤖 AI (via embeddings):
"The vectors are close. They mean similar things."
Let me show it in code as well:
```python
import os

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")
client = OpenAI(api_key=api_key)

text = "Neo feels Trinity is in danger as Agent Smith appears."
response = client.embeddings.create(
    input=text,
    model="text-embedding-3-small"
)
print("Vector Embeddings", response.data[0].embedding)
# Vector Embeddings [-0.03760385140776634, 0.07071041315793991, 0.005594520829617977, ...]
```
In The Matrix, Neo sees the world as raw code—numbers, patterns, and relationships in a 3D space. Each vector embedding from the OpenAI API can be thought of as a point in a high-dimensional space (though not strictly 3D, it's often visualized that way for simplicity).
Each number in the vector is like a coordinate in this space, representing a different feature of the sentence’s meaning.
Just as Neo learns to interpret the Matrix as a 3D structure of data, the AI models use these points in space to understand the meaning and relationships between words, like Neo’s feelings, danger, or Agent Smith.
In essence:
Neo sees the Matrix as a world of points and relationships in 3D space.
Embeddings are numerical points in a high-dimensional space that represent the meaning behind the sentence, helping the AI understand language more deeply.
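How does an AI decide that two points in that space "mean similar things"? A common measure is cosine similarity: vectors pointing in the same direction score near 1, opposite directions score negative. Here is a minimal sketch using the illustrative 3-number Neo/Trinity vectors from above (plus a made-up "Smith" vector); real embeddings have hundreds or thousands of dimensions, but the math is identical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

neo = [0.98, -0.33, 0.22]
trinity = [0.96, -0.34, 0.25]
smith = [-0.80, 0.50, -0.10]  # hypothetical vector pointing the other way

print(cosine_similarity(neo, trinity))  # close to 1.0 -> similar meaning
print(cosine_similarity(neo, smith))    # negative -> opposite direction
```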
Positional Encoding — "Order in Chaos"
🗣️ Neo:
“Without position, words are chaos. Positional encoding tells the model where a word belongs. Imagine the sentence: ‘There is no spoon.’ Swap the words and you lose the truth.”
Positional Encoding in The Matrix:
In The Matrix, Neo begins to see the world as code—and how each piece of code changes depending on its position. This is like Positional Encoding in AI, which helps the model understand that the same word can mean different things depending on where it appears in a sentence (semantic meaning).
Example from the Movie:
Take the sentence: “Neo saves Trinity.”
- Swap the positions: “Trinity saves Neo.”
Same words, opposite meaning. The position of each word determines who acts and who is acted upon—just like how Neo learns that the position of code in the Matrix changes the reality he experiences.
Just like in the spoon-bending scene, Neo learns that changing his perception of the code alters reality. Similarly, Positional Encoding changes the meaning of words based on their position in a sentence, just as Neo changes the spoon's reality by adjusting his perception.
Conclusion:
Positional Encoding works the same way, helping the model understand the context of words based on their position in the sentence—just like Neo reading the Matrix code and interpreting the meaning based on where he is.
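One classic way to encode position (from the original Transformer paper, "Attention Is All You Need") is with sine and cosine waves of different frequencies: each position gets a unique numeric fingerprint that is added to the word's embedding. A minimal sketch, with a small made-up dimension of 8 for readability:

```python
import math

def positional_encoding(position, d_model=8):
    """Sinusoidal encoding: a unique pattern of sin/cos values per position."""
    pe = []
    for i in range(d_model):
        angle = position / (10000 ** (2 * (i // 2) / d_model))
        pe.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return pe

# "There is no spoon" -- each word's position gets a distinct encoding,
# so swapping the words produces a different input to the model.
for pos, word in enumerate("There is no spoon".split()):
    print(pos, word, [round(v, 3) for v in positional_encoding(pos)])
```

Because every position produces a different vector, the model can tell "spoon" at position 3 apart from "spoon" at position 0 even though the word embedding itself is identical.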
Transformers — “Like Neo in full awareness mode.”
🗣️ Neo:
“I used to follow paths. Now I see all of them—at once.”
Think of Neo standing in the Matrix, surrounded by code. He's not just reading line by line—he sees every piece of the Matrix interacting in real-time. That’s exactly how Transformers process language or data.
Instead of going word-by-word like old RNNs or soldiers in a line, Transformers look at everything at once—and figure out what matters most.
Each word gets to ask:
“Who should I pay attention to?”
And then does it—smartly and in parallel.
Self-Attention — "Focus is everything."
🗣️ Neo:
"In a fight, I don’t waste time on everything. I focus on what matters—what’s coming at me next."
Self-Attention helps words update their semantic meaning by focusing on their most relevant relationships— just like Neo interpreting the Matrix in real time.
In The Matrix, Neo doesn’t react to everything—he focuses. During a fight, he zeroes in on what matters most: a punch from Agent Smith, a gun aimed at Trinity, a movement in the shadows.
That’s exactly how Self-Attention works in AI.
Each word in a sentence doesn't just sit there—it looks at all the other words and decides which ones are most relevant to its meaning. It’s not just about the word—it’s about who it's connected to.
It’s the perfect visual metaphor: everything slows down, the noise fades, and only the important tokens (or bullets 👀) remain in focus.
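The mechanics behind that "focus" can be sketched as scaled dot-product attention. The tiny 2-D "embeddings" below are invented for illustration, and real models use learned query/key/value projections rather than the raw vectors, but the core loop is the same: score every word against every other word, then turn the scores into attention weights with SoftMax.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

words = ["Neo", "dodges", "bullets"]
vectors = [[1.0, 0.2], [0.3, 0.9], [0.8, 0.4]]  # made-up 2-D embeddings
d = len(vectors[0])

# Each word scores every word (here queries = keys = the raw vectors),
# scales by sqrt(d), then converts the scores into attention weights.
for word, query in zip(words, vectors):
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in vectors]
    weights = softmax(scores)
    print(word, "attends to",
          dict(zip(words, [round(w, 2) for w in weights])))
```

The weights for each word always sum to 1—attention is a budget, and Self-Attention decides how each word spends it across the sentence.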
Multi-head Attention — “Why choose one focus, when you can have many?”
🗣️ Neo:
“Different heads, different perspectives. Multi-head attention lets models look at different meanings at the same time. Like fighting multiple agents—each move needs a different strategy.”
In The Matrix Reloaded, Neo fights dozens of Agent Smiths. If he only focused on one at a time, he’d be overwhelmed. But by switching strategies for each Smith, he manages the chaos.
Multi-head Attention works the same way—each "head" looks at a different part of the input, allowing the model to understand multiple aspects of the data at once, just like Neo handling multiple threats.
SoftMax — “Choice. It’s the illusion of control.”
🗣️ Neo:
“SoftMax gives probabilities. Which word comes next? Which path to take?
It’s how the machines make decisions—controlled randomness. But always leaning toward what makes sense.”
SoftMax in The Matrix:
Imagine Neo standing in front of a wall of code, each line representing a possible path. Some paths lead to victory, others to chaos—but only one has the highest probability.
Example in Neo’s World:
Morpheus offers Neo a choice—the red pill or the blue pill. Each pill represents a different outcome with different probabilities:
Red Pill (Path 1): Leads to the truth, chaos, and the unknown (90% probability of facing Agent Smith).
Blue Pill (Path 2): Leads to staying in the Matrix, peace, and ignorance (10% probability of staying in comfort).
Just like SoftMax, the Matrix calculates the probabilities and guides Neo toward the red pill, which has the higher likelihood of revealing the truth, even though it’s risky and dangerous.
SoftMax doesn’t pick anything by itself—it turns raw scores into a probability for each option. The model then samples from (or takes the top of) that distribution, so Neo, much like an AI model, is nudged toward the path that’s most likely to deliver results—the red pill.
SoftMax is Neo’s decision-making instinct in a world of choices—guided by probabilities, it leans toward what’s most likely, just like the red pill leading to a world of chaos and discovery.
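The red-pill/blue-pill split can be reproduced with a few lines of code. The raw scores (logits) below are made-up numbers chosen so SoftMax lands near the 90%/10% split from the analogy:

```python
import math

def softmax(scores):
    """Exponentiate each score, then normalize so the results sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

choices = ["red pill", "blue pill"]
scores = [2.2, 0.0]  # made-up logits: the model's raw preference

for choice, prob in zip(choices, softmax(scores)):
    print(f"{choice}: {prob:.0%}")
# red pill comes out around 90%, blue pill around 10%
```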
Temperature — “How wild do you want the Matrix to get?”
🗣️ Neo:
“Temperature controls chaos. Lower means safe. Higher means creativity, risk. Want a predictable response? Keep it cool. Want something unexpected? Heat it up.”
In The Matrix Revolutions, the final battle in Zion is chaos—machines attacking, explosions everywhere, and complete unpredictability. This is like turning up the temperature in AI—higher temperature means more randomness and risk, just like the battle raging out of control.
But by the end of the movie, Neo lowers the temperature. Instead of fighting fire with fire, he cools things down by negotiating with the machines—offering a solution to end the war. By keeping his cool, he brings about a peaceful resolution. In AI, cooling things down means more predictable, stable results—just like Neo's calm negotiation to control the chaos.
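In code, temperature is simply a divisor applied to the scores before SoftMax. A minimal sketch with made-up logits for three next-token options:

```python
import math

def softmax_with_temperature(scores, temperature=1.0):
    """Low temperature sharpens the distribution; high flattens it."""
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.5]  # made-up logits for three candidate tokens
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(scores, t)
    print(f"T={t}:", [round(p, 2) for p in probs])
# At T=0.2 nearly all the probability lands on the top option (predictable);
# at T=2.0 the options become much more even (creative, risky).
```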
Knowledge Cutoff — “Time stops—for the machines.”
🗣️Neo:
“Every AI has a moment where it stops learning. That’s the knowledge cutoff. It’s like freezing time.
I knew the Matrix until 2199. Beyond that, I had to discover myself.”
In the Matrix, time froze at 2199. Neo's world became static, unchanging. Just as AI models have knowledge cutoffs, marking limits to their learning, Neo faced a moment where external guidance waned, pushing him toward personal growth and exploration. No longer guided solely by the Matrix's data, Neo embarked on a quest to understand his true purpose and the broader universe beyond the digital confines.
Moral:
Just as AI models have knowledge cutoffs, marking limits to their learning, individuals too face moments where external guidance wanes, pushing them toward personal growth and exploration.
Closing: — “But it’s not!”
🗣️Neo: "Understanding the Matrix isn’t about seeing—it’s about believing. These concepts? They’re the code behind the illusion. Learn them, and you too will bend reality.”
Similarly, learning GenAI is about more than just processing data—it’s about uncovering patterns, pushing boundaries, and creating new realities. As you explore this world of intelligent machines, remember: master the code behind the algorithms, and you too can unlock the future of innovation.
Written by Anshul Ghogre