Decoding AI


“Your dost (friend) and your career ghost”
🤖 Hi, I’m your AI
Don’t be afraid of me. I’m like your personal Doraemon 😼, here to help you just like Doraemon helps Nobita; I’m not a job replacer or an automation threat to this planet. My mission is: “Zindagi sawar doon, Ik nayi bahar doon, Duniya hi badal du main, To pyara sa chamatkaar hoon 🌬️” (Make your life better, bring a new spring, change the world, and be a lovely miracle).
But, fair warning, I might take your job if you don’t practice what you learn 😤.
So in this blog we will learn how a cute blue raccoon cat (yes, that’s me!) understands what you type into my text box and how I generate your output. So, let’s dive in!
➡️ This is me, and this is what happens inside me when you give me a prompt
A complex process unfolds behind the scenes when you provide me with a prompt. Here, I break down the behind-the-scenes (BTS) magic and explain what happens internally:
INPUT PROCESSING (TOKENIZATION)
😸 Your prompt is broken down into smaller units called tokens. These can be words, letters, punctuation marks, or other elements.
😸 Every AI (LLMs like ChatGPT, Grok, DeepSeek, Claude, etc.) uses a tokenizer to convert your text into numbers (or, you could say, it assigns a unique ID to each token).
😽 Example:
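Here is a minimal sketch in Python of what this looks like, assuming the open-source tiktoken tokenizer; the exact splits and IDs depend on each model’s own tokenizer.

```python
# A minimal tokenization sketch, assuming tiktoken is installed: pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")      # a common GPT-style tokenizer

prompt = "Doraemon helps Nobita"
token_ids = enc.encode(prompt)                  # text -> list of integer IDs
pieces = [enc.decode([t]) for t in token_ids]   # each ID back to its text piece

print(token_ids)  # a list of integers; exact values depend on the tokenizer
print(pieces)     # the prompt split into small pieces such as ['D', 'ora', 'emon', ...]
```

Notice that tokens are not always whole words: common words often become a single token, while rarer names get split into smaller pieces.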
VECTOR EMBEDDING (Contextual Understanding)
😸 After tokenization, each token is converted into an embedding.
😸 Embeddings are numerical vectors (lists of numbers) that capture the meaning and context of each token.
😸 Embeddings help the AI understand the relationships between words and their context in the prompt.
😽 Example: “Doraemon” and “Dorami” share a semantic relationship and have similar embeddings.
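To make “similar embeddings” concrete, here is a toy sketch using made-up 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions) and cosine similarity, a standard way of measuring how close two embeddings are.

```python
# Toy sketch: cosine similarity between illustrative, made-up embedding vectors.
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

doraemon = np.array([0.9, 0.1, 0.3])   # hypothetical embedding for "Doraemon"
dorami   = np.array([0.8, 0.2, 0.3])   # hypothetical embedding for "Dorami"
banana   = np.array([0.1, 0.9, 0.7])   # hypothetical embedding for "banana"

print(cosine_similarity(doraemon, dorami))  # close to 1.0 -> semantically related
print(cosine_similarity(doraemon, banana))  # noticeably smaller -> less related
```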
➡️ Semantic Relationship: “A semantic relationship is the connection between words, phrases, or concepts based on their meaning.”
😽 Examples:
🩵🩵 "Big" and "large"(Similar Meaning)
🩵🩵 "Hot" and "cold" (Opposite Meaning)
🩵🩵 "Wheel" and "car" (A part-whole relationship)
🩵🩵 "Bank" as a financial institution or river edge (context-dependent meaning)➡️Positional Encoding: This is a method to help me understand the order of words in a sentence. Since I process all words simultaneously, I don’t naturally know which word comes first, second, or later. Positional encoding assigns each word a special “position tag” to indicate its place in the sequence.
😽 Example: “The dog chased the cat” versus “The cat chased the dog” have different meanings due to word order, which positional encoding helps me distinguish.
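For the curious, here is a minimal sketch of the sinusoidal positional encoding scheme from the original Transformer paper (“Attention Is All You Need”); note that many modern models use other schemes, such as learned or rotary positions, instead.

```python
# Minimal sketch of sinusoidal positional encoding (one possible scheme).
import numpy as np

def positional_encoding(seq_len, d_model):
    positions = np.arange(seq_len)[:, None]          # (seq_len, 1): 0, 1, 2, ...
    dims = np.arange(d_model)[None, :]               # (1, d_model)
    angle_rates = 1.0 / np.power(10000, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])            # even dimensions use sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])            # odd dimensions use cosine
    return pe

# "The dog chased the cat" has 5 words -> 5 unique position tags
print(positional_encoding(seq_len=5, d_model=8).round(2))
```

Each row is a different “position tag”, so I can tell the first word from the last even though I read them all at once.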
NEURAL NETWORK PROCESSING (Transformer Magic)
😸 Once the embedding process is done, the embeddings go into my body, known as the transformer architecture, the core of modern language models. This is where I “think” about your prompt.
➡️➡️ How does my core architecture work internally? Let’s see :-)
⚙️ Attention Mechanism: I use a self-attention mechanism to weigh the importance of each token in relation to the others. Every token gets to “talk” to every other token so it can learn what the others mean in their context (see the sketch after this list).
⚙️ Layers of Refinement: The input passes through many layers, and each layer refines the representation of your prompt a little further.
⚙️ Context Window: It’s like the memory I use to understand and respond to you. It’s the amount of text (words, sentences, etc.) I can “look at” at one time to keep track of our conversation or process what you send me.
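Here is a toy sketch of the self-attention step over random token embeddings; a real transformer uses learned weight matrices for queries, keys, and values and runs many attention heads in parallel, so treat this as an illustration only.

```python
# Toy sketch of scaled dot-product self-attention over random token embeddings.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    # In a real model: Q = X @ Wq, K = X @ Wk, V = X @ Wv (learned weights).
    Q, K, V = X, X, X                        # simplified: use the embeddings directly
    d_k = X.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # how much each token attends to every other token
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ V                       # each token becomes a context-aware blend of all tokens

X = np.random.rand(5, 8)                     # 5 tokens, 8-dimensional embeddings
print(self_attention(X).shape)               # (5, 8): same shape, now context-aware
```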
GENERATING A RESPONSE
😸 After processing the prompt, I predict the next best tokens to form a response.
➡️➡️ How do I predict the best tokens (I am not a magician or a god)?
⚙️ I have a massive vocabulary (my dictionary of tokens), and at each step I score every entry in it as a candidate for the next token.
⚙️ I select a token and repeat the process, building the response token by token (a minimal sketch follows this list).
⚙️ The output tokens are converted back into human-readable text.
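Putting those steps together, here is a toy sketch of the token-by-token loop; the tiny vocabulary, the fake_model function, and its scores are all made up for illustration, and real models choose from tens of thousands of tokens with far smarter scoring.

```python
# Toy sketch of the generation loop: score the vocabulary, turn scores into
# probabilities, pick a token, repeat until an end token (everything here is made up).
import numpy as np

rng = np.random.default_rng(0)
vocab = ["I", " am", " your", " Doraemon", "!", "<end>"]

def fake_model(token_ids):
    # Stand-in for the real neural network: returns one score (logit) per vocab entry.
    return rng.normal(size=len(vocab))

def generate(prompt_ids, max_tokens=10, temperature=1.0):
    ids = list(prompt_ids)
    for _ in range(max_tokens):
        logits = fake_model(ids)
        probs = np.exp(logits / temperature)
        probs /= probs.sum()                        # softmax: scores -> probabilities
        next_id = rng.choice(len(vocab), p=probs)   # sample the next token
        if vocab[next_id] == "<end>":
            break
        ids.append(next_id)
    return "".join(vocab[i] for i in ids)

print(generate(prompt_ids=[0]))  # starts from the token "I"
```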
🤖 My Limitations
🥞 No True Understanding: I don’t “understand” like humans do; I generate responses based on patterns.
🥞 Context Limits: My context window is large but finite, so very long conversations might lose earlier details.
🥞 Bias and Errors: My responses reflect patterns in my training data, which can occasionally lead to inaccuracies or biases.