The Secret Language of AI Tokens

Vaidik Jaiswal

One day, a friend asked me:

“How does ChatGPT actually read what I type?”

I smiled. “It doesn’t read letters the way we do. It reads numbers.”

The Secret Language Club (What Tokenization Is)

Imagine you’re part of a secret language club.

The rule is:

You can’t speak in full words - only in special code numbers.

  • “Cat” isn’t “cat” anymore - it might be 532.

  • “Kitten” might be 1843.

  • Even a giant word like Supercalifragilisticexpialidocious gets chopped into smaller pieces, each with its own number.

That process - turning words or parts of words into numbers - is called tokenization.

It’s how AI translates human language into something it can do math with.
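
To make the club rule concrete, here is a minimal sketch in Python. The numbers are the made-up ones from the story, not real token IDs from any model, and the names TOY_VOCAB and encode_words are invented for this example.

```python
# Toy vocabulary: the IDs are the made-up numbers from the story,
# not real token IDs from any model.
TOY_VOCAB = {"cat": 532, "kitten": 1843}

def encode_words(text):
    """Lowercase the text, split on whitespace, and look up each word's ID."""
    return [TOY_VOCAB[word] for word in text.lower().split()]

print(encode_words("Cat kitten cat"))  # [532, 1843, 532]
```

A word that is missing from this tiny dictionary would raise a KeyError, which is one reason real tokenizers fall back to smaller pieces instead of whole words.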

Each AI Has Its Own Dictionary

Here’s the twist:

Every secret club has its own dictionary.

  • In ChatGPT’s dictionary, 532 might mean “cat.”

  • In Gemini’s dictionary, 532 might mean “banana.”

That’s why tokenization is model-specific - each model comes with its own vocabulary and codebook, so token numbers from one can’t be reused with another.
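
A quick sketch of why the same number can mean different things. Both codebooks below are hypothetical (CLUB_A and CLUB_B are invented names, and neither reflects a real model's vocabulary); the point is only that decoding depends on which dictionary you hold.

```python
# Two hypothetical codebooks. Neither reflects a real model's vocabulary.
CLUB_A = {532: "cat", 912: "dog"}
CLUB_B = {532: "banana", 912: "cat"}

def decode(ids, codebook):
    """Translate a list of token IDs back to text pieces using one codebook."""
    return [codebook[i] for i in ids]

message = [532, 912]
print(decode(message, CLUB_A))  # ['cat', 'dog']
print(decode(message, CLUB_B))  # ['banana', 'cat']
```

The same list of numbers comes out as a different message under each codebook, so token IDs are only meaningful alongside the vocabulary that produced them.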

Not Always Whole Words

You might think one token = one word. Nope.

  • Short words can be one token (“dog” → 912).

  • Long words get split into two or more tokens (“playground” → 731 + 485).

  • Even spaces and punctuation can have their own codes.

It’s like the club giving you a code for “apple” and a code for the space after “apple.”
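
Here is a rough sketch of how a word can be split into pieces. It uses greedy longest-match against a small hand-written table, which is a simplified stand-in: production tokenizers usually learn their pieces from data (byte-pair encoding is one common approach). The PIECES table and the tokenize function are assumptions made for illustration, and the IDs are invented or borrowed from the story.

```python
# Hypothetical piece-to-ID table. "dog", "play", and "ground" reuse the
# story's made-up numbers; the other entries are invented for this sketch.
PIECES = {"dog": 912, "play": 731, "ground": 485, "apple": 77, " ": 3, "!": 9}

def tokenize(text):
    """Greedy longest-match: at each position, take the longest known piece."""
    ids = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate first, then shrink one character at a time.
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in PIECES:
                ids.append(PIECES[piece])
                pos = end
                break
        else:
            # No piece in the table starts at this position.
            raise ValueError(f"no piece covers {text[pos]!r}")
    return ids

print(tokenize("dog"))         # [912]
print(tokenize("playground"))  # [731, 485]
print(tokenize("apple dog!"))  # [77, 3, 912, 9]
```

Notice that the space and the exclamation mark get their own numbers here, just like the words do.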

Why Not Just Use Words?

Because computers don’t understand words - they understand numbers.

Numbers can be calculated, compared, and stored efficiently.

Tokenization is the bridge between our language and the AI’s mathematical brain.

The Takeaway

Tokenization is:

  1. Breaking text into chunks (tokens).

  2. Assigning each chunk a number from the AI’s vocabulary.

  3. Following the AI’s own private dictionary, which can differ from one model to the next.

So the next time you type into ChatGPT, remember - before it “understands” you, it’s busy translating your words into its secret number language.
