Decoding AI Jargons with Chai

Taha Iftikhar

Transformer : -

  • You can think of a Transformer as a machine that understands and generates human language

  • Its job is to understand the input and produce the output

  • It was first introduced in the “Attention Is All You Need” research paper published by Google

  • Examples - ChatGPT, Google Translate

Encoder : -

  • The encoder’s job is to understand the input sentence

  • Like: Input —> understand it —> build a vector

  • For example, if I write “I love coding”, the encoder converts each word into a vector and encodes it along with its context.

  • In AI terms - the encoder converts user input into tokens

  •   import tiktoken

      # load the tokenizer used by the gpt-4o model
      encoder = tiktoken.encoding_for_model("gpt-4o")

      text = "hello ! I love coding and cricket"
      tokens = encoder.encode(text)
      print("Tokens:", tokens)
      # [200264, 17360, 200266, 3575, 553, 261, 10297, 29186, 200265, 200264, 1428, 200266, 24912, 1073, 357, 3047, 22458, 326, 57976, 1, 200265, 200264, 173781, 200266]
    

Decoder : -

  • The decoder is the part that uses the encoder’s vectors to generate the output.

  • For example, if the encoder has understood “I love coding”, the decoder can translate it into “Main coding karta hoon”.

  • Basically, it converts the tokens back into the input sentence

  •   import tiktoken

      encoder = tiktoken.encoding_for_model("gpt-4o")

      # the token ids produced by the encoder example above
      my_tokens = [200264, 17360, 200266, 3575, 553, 261, 10297, 29186, 200265, 200264, 1428, 200266, 24912, 1073, 357, 3047, 22458, 326, 57976, 1, 200265, 200264, 173781, 200266]

      decoded = encoder.decode(my_tokens)
      print("Decoded:", decoded)   # hello ! I love coding and cricket
    

Vector Embeddings : -

  • These convert the input text into vectors that capture its semantic meaning

  • The classic example: “king” is to “queen” as “man” is to “woman” - embeddings capture that relationship, so vector arithmetic like king − man + woman lands close to queen (see the sketch below)
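
A minimal sketch of the idea with toy 3-dimensional vectors (the numbers below are made up purely for illustration; real embeddings have hundreds or thousands of learned dimensions):

      import numpy as np

      # Toy 3-d embeddings (made-up values, just to show the analogy)
      king  = np.array([0.9, 0.8, 0.1])   # royal + male
      queen = np.array([0.9, 0.2, 0.1])   # royal + female
      man   = np.array([0.1, 0.8, 0.3])   # male
      woman = np.array([0.1, 0.2, 0.3])   # female

      # The famous analogy: king - man + woman ≈ queen
      result = king - man + woman
      print(result)                       # [0.9 0.2 0.1]
      print(np.allclose(result, queen))   # True (by construction in this toy example)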

Semantic Meaning : -

  • It decides the meaning of any word based on its context

  • “He went to the bank to deposit money.” → bank = financial

  • “She sat by the bank of the river.” → bank = river side

  • Semantic meaning changes with context - a quick code demo follows
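
A sketch of how you could see this in code, assuming the third-party sentence-transformers library and its all-MiniLM-L6-v2 model (neither appears in this article; they are just one convenient choice for a demo):

      from sentence_transformers import SentenceTransformer, util

      # downloads the model on first run
      model = SentenceTransformer("all-MiniLM-L6-v2")

      sentences = [
          "He went to the bank to deposit money.",   # bank = financial
          "She sat by the bank of the river.",       # bank = river side
          "The ATM at the branch was out of cash.",  # clearly financial context
      ]
      emb = model.encode(sentences)

      # cosine similarity: the financial "bank" sentence should sit
      # closer to the ATM sentence than the river "bank" sentence does
      print(util.cos_sim(emb[0], emb[2]))  # higher
      print(util.cos_sim(emb[1], emb[2]))  # lower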

Positional encoding : -

  • A Transformer has no built-in notion of sequence order, so the model has to be told which word sits at which position.

  • Like 1st word, 2nd word, etc. For this we add positional encoding: basically, a position-based vector is added to each word’s embedding vector.

      Ex -
      1. The cat sat on the mat
      2. The mat sat on the cat
      The encoder will generate the same tokens for both, so the question is:
      how will the model differentiate between these two sentences?
      That is why positional encoding is used: it encodes each word's position (see the sketch below).
    
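A minimal numpy sketch of the sinusoidal positional encoding from the “Attention Is All You Need” paper (the sequence length and dimension are kept tiny here just for illustration):

      import numpy as np

      def positional_encoding(seq_len, d_model):
          """Sinusoidal positional encoding from 'Attention Is All You Need'."""
          pos = np.arange(seq_len)[:, None]     # positions 0 .. seq_len-1
          i = np.arange(d_model)[None, :]       # embedding dimensions
          angle = pos / np.power(10000, (2 * (i // 2)) / d_model)
          pe = np.zeros((seq_len, d_model))
          pe[:, 0::2] = np.sin(angle[:, 0::2])  # even dims -> sin
          pe[:, 1::2] = np.cos(angle[:, 1::2])  # odd dims  -> cos
          return pe

      # "The cat sat on the mat" has 6 words: each gets a unique
      # position vector, so the model can tell sentence 1 from sentence 2
      pe = positional_encoding(seq_len=6, d_model=8)
      print(pe.shape)   # (6, 8): one 8-d position vector per word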

Self-Attention : -

  • This mechanism helps the model understand a word in the context of the other words in the sentence.

  • For example: “She poured water in the jar and put it in the fridge.” What does “it” refer to here? Self-attention helps the model figure that out, as sketched below.
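
A bare-bones numpy sketch of scaled dot-product self-attention. Real models compute Q, K and V with learned projection matrices; here random vectors stand in just to show the mechanics:

      import numpy as np

      def softmax(x):
          e = np.exp(x - x.max(axis=-1, keepdims=True))
          return e / e.sum(axis=-1, keepdims=True)

      def self_attention(Q, K, V):
          """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
          d = Q.shape[-1]
          scores = Q @ K.T / np.sqrt(d)   # how strongly each word attends to every other word
          weights = softmax(scores)       # each row sums to 1
          return weights @ V, weights

      # 4 tokens with 8-d embeddings (random stand-ins for real embeddings)
      rng = np.random.default_rng(0)
      x = rng.normal(size=(4, 8))
      out, weights = self_attention(x, x, x)
      print(weights.round(2))   # e.g. how much "it" attends to "jar" vs "fridge"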

Softmax : -

  • Softmax is a mathematical function that converts logits (raw, unnormalized scores) into probabilities.

  • It is used at the end of the model’s output, so we know which word or class is most likely.

  •   Output logits: [2.0, 1.0, 0.1]
      Softmax converts to: [0.66, 0.24, 0.10]
      (Most likely = first class)
    
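You can verify the numbers above with a few lines of numpy:

      import numpy as np

      def softmax(logits):
          e = np.exp(logits - np.max(logits))   # subtract max for numerical stability
          return e / e.sum()

      print(softmax(np.array([2.0, 1.0, 0.1])).round(2))   # [0.66 0.24 0.1 ]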

Multi-head Attention : -

  • A single attention head looks at only one pattern; multi-head attention can look at multiple patterns at the same time. For example, one head may focus on the subject and another on the object, and combining them all gives a better understanding.

  •   One head focuses on subject, one on verb, one on object
      Sentence: “The cat sat on the mat.”
      Head 1: “cat” ↔ “sat”
      Head 2: “sat” ↔ “mat”
    
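A compact numpy sketch of the same idea: split the embedding into heads, run attention separately per head, then concatenate. (Real implementations also apply learned W_Q, W_K, W_V and W_O projection matrices, omitted here for brevity.)

      import numpy as np

      def softmax(x):
          e = np.exp(x - x.max(axis=-1, keepdims=True))
          return e / e.sum(axis=-1, keepdims=True)

      def attention(Q, K, V):
          return softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V

      def multi_head_attention(x, n_heads):
          seq_len, d_model = x.shape
          d_head = d_model // n_heads
          heads = []
          for h in range(n_heads):
              part = x[:, h * d_head:(h + 1) * d_head]   # this head's slice of the embedding
              heads.append(attention(part, part, part))  # each head sees its own pattern
          return np.concatenate(heads, axis=-1)          # combine all heads

      rng = np.random.default_rng(1)
      x = rng.normal(size=(6, 8))                        # 6 tokens, d_model = 8
      print(multi_head_attention(x, n_heads=2).shape)    # (6, 8)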

Temperature : -

  • It controls the randomness of the words in the output text

  • Low temperature = confident & repetitive

  • High temperature = creative & a bit random
  • Ex - Prompt: “Once upon a time...”

    Temp 0.2 → “...there was a king who ruled wisely.”

    Temp 1.0 → “...aliens landed with disco lights and pineapples.”
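
Under the hood, temperature simply divides the logits before softmax. A quick sketch with made-up logits for three candidate next words:

      import numpy as np

      def softmax(x):
          e = np.exp(x - np.max(x))
          return e / e.sum()

      logits = np.array([2.0, 1.0, 0.1])   # made-up scores for 3 candidate words

      for temp in (0.2, 1.0):
          print(temp, softmax(logits / temp).round(2))
      # 0.2 -> [0.99 0.01 0.  ]  confident: almost always picks the top word
      # 1.0 -> [0.66 0.24 0.1 ]  more spread out: more random choices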

Knowledge Cutoff : -

  • AI models like ChatGPT are trained on data only up to a certain date.

  • The model has no knowledge of anything after that date; that is called the knowledge cutoff.

  • For example, ChatGPT’s knowledge cutoff is June 2023 (or April 2024 for newer versions).

  •    ChatGPT (2021): Doesn’t know about GPT-4
       ChatGPT (2024): Doesn’t know about events in 2025
    

Tokenization : -

  • It cuts text into small pieces: whole words or subwords

  •   Sentence: “unbelievable”
      Tokenized into: ["un", "believ", "able"] (subwords)
      “I’m fine” → ["I", "’m", "fine"]
    
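You can inspect the actual splits with tiktoken (the exact subwords depend on the tokenizer; gpt-4o's tokenizer may split these words differently from the illustration above):

      import tiktoken

      encoder = tiktoken.encoding_for_model("gpt-4o")

      for text in ["unbelievable", "I’m fine"]:
          tokens = encoder.encode(text)
          pieces = [encoder.decode([t]) for t in tokens]   # turn each token id back into text
          print(text, "->", pieces)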

Vocab Size : -

  • The number of tokens or words an AI model knows

  • ChatGPT - vocab size ≈ 200k tokens

  • Claude also has a similar-sized vocab
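
tiktoken exposes the vocab size directly, so the ~200k figure is easy to check for gpt-4o's tokenizer:

      import tiktoken

      encoder = tiktoken.encoding_for_model("gpt-4o")   # uses the o200k_base encoding
      print(encoder.n_vocab)   # roughly 200k tokens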

Summary : -

[Image: summary diagram of all the terms covered above]

Thanks for reading

Let’s Connect Guys - @DexterIfti
