Decoding AI Jargons with Chai


Transformer : -
You can think of a Transformer as a machine that understands and generates human language.
Its job is to understand the input and produce an output.
It was first introduced in the “Attention Is All You Need” research paper, published by researchers at Google in 2017.
Examples - ChatGPT, Google Translate
Encoder : -
The encoder's job is to understand the input sentence.
Like -> Input -> understand it -> build a vector
For example, if I write “I love coding”, the encoder converts each word into a vector and encodes it along with its context.
In AI terms - the encoder converts the user's input into tokens (here using the tiktoken tokenizer):
import tiktoken

encoder = tiktoken.encoding_for_model("gpt-4o")  # using the gpt-4o tokenizer

text = "hello ! I love coding and cricket"
token = encoder.encode(text)
print("Token", token)
# [200264, 17360, 200266, 3575, 553, 261, 10297, 29186, 200265, 200264, 1428, 200266, 24912, 1073, 357, 3047, 22458, 326, 57976, 1, 200265, 200264, 173781, 200266]
Decoder : -
The decoder is the part that uses the encoder's vectors to generate the output.
For example, if the encoder understood “I love coding”, the decoder can translate it into “Main coding karta hoon”.
Basically, this converts the tokens back into the original sentence:
import tiktoken

encoder = tiktoken.encoding_for_model("gpt-4o")

my_token = [200264, 17360, 200266, 3575, 553, 261, 10297, 29186, 200265, 200264, 1428, 200266, 24912, 1073, 357, 3047, 22458, 326, 57976, 1, 200265, 200264, 173781, 200266]
decoded = encoder.decode(my_token)
print("Decoded token", decoded)
# hello ! I love coding and cricket
Vector Embeddings : -
This converts input text into vectors that capture its semantic meaning.
The classic example: in embedding space, “king” relates to “queen” the same way “man” relates to “woman”. Roughly, king - man + woman lands close to queen, as the toy sketch below shows.
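Here is a tiny sketch of that analogy. The 3-number vectors are made-up toy values purely for illustration; real embeddings have hundreds or thousands of dimensions.

import numpy as np

# toy, hand-picked vectors just to illustrate the idea - NOT real embeddings
king  = np.array([0.9, 0.8, 0.1])   # royal-ish, male-ish
queen = np.array([0.9, 0.1, 0.8])   # royal-ish, female-ish
man   = np.array([0.1, 0.8, 0.1])
woman = np.array([0.1, 0.1, 0.8])

result = king - man + woman          # "king minus man plus woman"
print(result)                        # [0.9 0.1 0.8] -> lands on queen
print(np.allclose(result, queen))    # True for these toy values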
Semantic Meaning : -
The meaning of any word is decided based on its context.
“He went to the bank to deposit money.” → bank = a financial institution
“She sat by the bank of the river.” → bank = the side of a river
Semantic meaning changes with context.
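To see this in code, here is a small sketch (my own, not from the post) using the Hugging Face transformers library with bert-base-uncased: the vector the model produces for “bank” differs between the two sentences, because it depends on the surrounding words.

import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def bank_vector(sentence):
    # return the contextual vector for the token "bank" in this sentence
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    idx = tokens.index("bank")
    return outputs.last_hidden_state[0, idx]

v1 = bank_vector("He went to the bank to deposit money.")
v2 = bank_vector("She sat by the bank of the river.")
# noticeably below 1.0: same word, different meaning in context
print(torch.nn.functional.cosine_similarity(v1, v2, dim=0))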
Positional encoding : -
A Transformer has no built-in sense of sequence order, so we have to tell the model which word sits at which position.
Like 1st word, 2nd word, etc. For this we add positional encoding: basically, a position-based vector is added to each word's vector.
Example:
1. The cat sat on the mat
2. The mat sat on the cat
The encoder generates the same tokens for both, so how will the model tell these two sentences apart? That's why positional encoding is used: it pins down the position of every word (see the sketch below).
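One common way to build these position vectors is the sinusoidal scheme from the “Attention Is All You Need” paper (an assumption on my part; the post does not say which scheme is meant). A minimal numpy sketch:

import numpy as np

def positional_encoding(seq_len, d_model):
    # PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
    # PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
    positions = np.arange(seq_len)[:, np.newaxis]            # (seq_len, 1)
    dims = np.arange(d_model)[np.newaxis, :]                 # (1, d_model)
    angle_rates = 1 / np.power(10000, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])
    pe[:, 1::2] = np.cos(angles[:, 1::2])
    return pe

pe = positional_encoding(seq_len=6, d_model=8)
print(pe.shape)  # (6, 8): one position vector per word, added to that word's embedding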
Self-Attention : -
This mechanism helps the model understand a word in the context of the other words in the sentence.
For example: “She poured water in the jar and put it in the fridge.” What does “it” refer to here? Self-attention helps in figuring that out.
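Here is a minimal numpy sketch of scaled dot-product self-attention with random toy weights - just the mechanics, not a trained model:

import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])   # how strongly each word attends to every other word
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V                        # context-aware vector for each word

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                   # 5 toy word embeddings of size 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)    # (5, 8)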
Softmax :-
Softmax is a mathematical function that converts logits (the raw scores the model outputs) into probabilities.
It is used at the end of the model's output, so we know which word or class is most likely.
Output logits: [2.0, 1.0, 0.1]
Softmax converts to: [0.66, 0.24, 0.10] (most likely = first class)
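You can check those numbers yourself in a couple of lines of numpy:

import numpy as np

logits = np.array([2.0, 1.0, 0.1])
probs = np.exp(logits) / np.exp(logits).sum()  # softmax: exponentiate, then normalize
print(probs.round(2))                          # [0.66 0.24 0.1 ]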
Multi-head Attention : -
A single attention head looks at only one kind of pattern; multi-head attention can look at multiple patterns at the same time. For example, one head may pay attention to the subject and another to the object, and combining them all gives a better understanding.
One head focuses on the subject, one on the verb, one on the object.
Sentence: “The cat sat on the mat.”
Head 1: “cat” ↔ “sat”
Head 2: “sat” ↔ “mat”
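A small numpy sketch of the idea: each head gets its own Q/K/V projections (so it can learn its own pattern), and the heads' outputs are concatenated at the end. Toy random weights only.

import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, num_heads, seed=0):
    seq_len, d_model = X.shape
    d_head = d_model // num_heads
    rng = np.random.default_rng(seed)
    heads = []
    for _ in range(num_heads):
        # each head has its own projections, so each can attend to a different pattern
        Wq, Wk, Wv = (rng.normal(size=(d_model, d_head)) for _ in range(3))
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        weights = softmax(Q @ K.T / np.sqrt(d_head), axis=-1)
        heads.append(weights @ V)
    return np.concatenate(heads, axis=-1)      # combine all heads' views

X = np.random.default_rng(1).normal(size=(6, 16))   # 6 toy word vectors
print(multi_head_attention(X, num_heads=4).shape)   # (6, 16)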
Temperature : -
This decides how much randomness goes into picking words in the output text.
Low temperature = confident & repetitive
High temperature = creative & a bit random
Ex - Prompt: “Once upon a time...”
Temp 0.2 → “...there was a king who ruled wisely.”
Temp 1.0 → “...aliens landed with disco lights and pineapples.”
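Under the hood, temperature just rescales the logits before softmax: dividing by a small number sharpens the distribution, dividing by a big number flattens it. A quick sketch:

import numpy as np

def softmax_with_temperature(logits, temperature):
    scaled = np.array(logits) / temperature   # low T sharpens, high T flattens
    e = np.exp(scaled - scaled.max())
    return e / e.sum()

logits = [2.0, 1.0, 0.1]
print(softmax_with_temperature(logits, 0.2).round(3))  # ~[0.993 0.007 0.   ] -> almost always the same word
print(softmax_with_temperature(logits, 1.0).round(3))  # ~[0.659 0.242 0.099] -> more spread, more random picks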
Knowledge Cutoff : -
AI models like ChatGPT are trained on data only up to a certain date.
They have no knowledge of anything after that date; that is what we call the knowledge cutoff.
For example, ChatGPT's knowledge cutoff is June 2023 (or April 2024 for newer versions).
ChatGPT (2021): doesn't know about GPT-4
ChatGPT (2024): doesn't know about events in 2025
Tokenization : -
This cuts text into small pieces: words or subwords.
Sentence: “unbelievable”
Tokenized into: ["un", "believ", "able"] (subwords)
“I’m fine” → ["I", "’m", "fine"]
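You can see the splits for yourself with tiktoken (note: the exact pieces depend on the tokenizer, so they may differ from the illustrative splits above):

import tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")
for text in ["unbelievable", "I'm fine"]:
    ids = enc.encode(text)
    pieces = [enc.decode([i]) for i in ids]   # decode each token id back to its text piece
    print(text, "->", pieces)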
Vocab Size : -
The number of distinct tokens (or words) an AI model knows.
ChatGPT - vocab size = ~200k tokens
Claude also has a vocab of roughly the same size.
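You can check the gpt-4o tokenizer's vocab size directly with tiktoken (a small sketch, not from the original post):

import tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")
print(enc.n_vocab)  # roughly 200k for the o200k_base encoding used by gpt-4o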
Summary : -
Thanks for reading
Let’s Connect Guys - @DexterIfti
