Understanding AI Jargon One Sip at a Time: My Beginner Journey into GenAI


AI jargon may sound intimidating at first, but once you understand the intuition behind each term, everything starts to click. I covered a bunch of these terms, and here’s a fun, beginner-friendly breakdown of what they really mean.
GPT – Generative Pre-trained Transformer
G: Generative – it generates text.
P: Pre-trained – it’s trained on a huge dataset before you even touch it.
T: Transformer – the architecture that powers it all.
💬 GPT is a type of **LLM (Large Language Model)**, not an agent, which means it can answer questions only based on the data it’s already trained on.
ChatGPT = GPT + Agent
ChatGPT uses GPT at its core but also behaves like an agent: it can take instructions, analyze intent, and respond naturally.
1)Transformers
Transformers are a type of neural network architecture that revolutionized NLP (Natural Language Processing). Introduced in the paper “Attention is All You Need” (2017), transformers process all tokens in parallel instead of one-by-one like older models.
They are the backbone of models like GPT and Gemini.
2)Encoder
The encoder is the part of a transformer that reads and understands the input.
Example: You type "The cat sat on the mat"
Encoder converts it into numerical representations (vectors) with context.
3)Decoder
The decoder takes those vector representations and generates an output, like a translated sentence, response, or summary.
📤 Example: It could turn your English sentence into French or into a chatbot response.
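To see the encoder-decoder flow end to end, here's a tiny sketch. It assumes the Hugging Face transformers library and the t5-small model, which are my own choices for illustration, not something from this post:

```python
# Minimal encoder-decoder demo (assumes: pip install transformers sentencepiece torch,
# plus internet access to download t5-small the first time).
from transformers import pipeline

# t5-small is a small encoder-decoder transformer: the encoder reads the English
# sentence, the decoder generates the French translation token by token.
translator = pipeline("translation_en_to_fr", model="t5-small")
result = translator("The cat sat on the mat")
print(result[0]["translation_text"])  # something like "Le chat etait assis sur le tapis"
```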
4)Vectors
Everything in AI is math. A vector is just a list of numbers that represents a word, sentence, or idea in a mathematical way.
Example:
“King” = [4, 6]
“Man” = [4, 3]
“Queen” ≈ “King” - “Man” + “Woman”
Vectors help the model "understand" relationships between words.
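Here's that arithmetic as a tiny NumPy sketch; the values for "Woman" and "Queen" are made up so the math works out with the toy numbers above:

```python
import numpy as np

# Toy 2D word vectors (made-up numbers, just to show the arithmetic)
king  = np.array([4.0, 6.0])
man   = np.array([4.0, 3.0])
woman = np.array([3.0, 3.0])
queen = np.array([3.0, 6.0])

result = king - man + woman
print(result)                      # [3. 6.] -> lands exactly on "queen" in this toy example
print(np.allclose(result, queen))  # True
```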
5)Embeddings
Embeddings are the technique of turning words into vectors: they map words to points in a space where distance reflects similarity in meaning.
Example:
Words like "happy", "joyful", "cheerful" will have similar embeddings — they’ll be close in vector space.
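A quick sketch of that idea with made-up 3-D embeddings (real ones have hundreds of dimensions): words with similar meanings end up with high cosine similarity:

```python
import numpy as np

# Made-up embeddings, only for illustration
emb = {
    "happy":    np.array([0.90, 0.80, 0.10]),
    "joyful":   np.array([0.85, 0.82, 0.12]),
    "cheerful": np.array([0.88, 0.79, 0.15]),
    "car":      np.array([0.10, 0.20, 0.95]),
}

def cosine(a, b):
    # 1.0 = same direction (similar meaning), near 0.0 = unrelated
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(emb["happy"], emb["joyful"]))  # ~0.99, very close in meaning
print(cosine(emb["happy"], emb["car"]))     # ~0.29, unrelated words sit far apart
```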
🔢 Vectors & Embeddings
Every word in your prompt is turned into a vector, a list of numbers that represents its meaning.
Example:
Picture a simple 2D chart where each word is a point. If you subtract "Man" from "King" and add "Woman", you get something close to "Queen". This shows semantic meaning captured via vector embeddings.
6)Positional Encoding
Transformers process all tokens in parallel rather than reading left to right, so how do they know the order of words?
Positional encoding adds position-based information to tokens so the model knows “The cat sat” is different from “Sat cat the”.
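Here's a minimal NumPy sketch of the sinusoidal positional encoding described in "Attention is All You Need" (the sequence length and dimension below are arbitrary toy values):

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    pos = np.arange(seq_len)[:, None]          # positions 0..seq_len-1
    i = np.arange(d_model)[None, :]            # embedding dimensions 0..d_model-1
    angle = pos / np.power(10000, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])       # even dimensions use sine
    pe[:, 1::2] = np.cos(angle[:, 1::2])       # odd dimensions use cosine
    return pe

# "The cat sat" -> 3 positions; these vectors get added to the token embeddings,
# so the same word at a different position ends up with a different final vector.
print(positional_encoding(seq_len=3, d_model=8).round(2))
```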
7)Semantic Meaning
Semantic means “meaning.” Models try to capture what a word actually means in context.
Example:
- “Bank” in “River bank” vs. “ICICI Bank”
The model uses surrounding words to determine the semantic meaning of “bank”.
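One way to see this in code is to compare the vector a model assigns to "bank" in two different sentences. The sketch below assumes the Hugging Face transformers library and the bert-base-uncased model, neither of which is part of this post; they're just a convenient model to inspect:

```python
# Assumes: pip install transformers torch (and a download of bert-base-uncased).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def bank_vector(sentence):
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]           # one vector per token
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index("bank")]                         # the vector for "bank"

v_river = bank_vector("I sat on the river bank.")
v_money = bank_vector("I deposited cash at the bank.")
v_money2 = bank_vector("She opened an account at the bank.")

cos = torch.nn.functional.cosine_similarity
print("river vs money:", cos(v_river, v_money, dim=0).item())   # lower similarity
print("money vs money:", cos(v_money, v_money2, dim=0).item())  # higher: same sense of "bank"
```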
8)Self Attention
Self-attention lets each word look at every other word in the sentence to understand its meaning in context.
Example:
- In “The river bank,” self-attention helps the model connect “river” to “bank” and interpret correctly.
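Here's a stripped-down, single-head self-attention sketch in NumPy (random toy numbers): each token's query is scored against every token's key, and the softmaxed scores decide how much of each token's value flows into the new representation:

```python
import numpy as np

np.random.seed(0)
tokens, d = 3, 4                       # e.g. "The", "river", "bank" with 4-dim embeddings
x = np.random.randn(tokens, d)         # toy token embeddings

W_q, W_k, W_v = (np.random.randn(d, d) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v

scores = Q @ K.T / np.sqrt(d)                          # how much each token attends to the others
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)         # softmax over each row

output = weights @ V                                   # context-aware vector for every token
print(weights.round(2))                                # row i = how token i spreads its attention
```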
9)Multi-Head Attention
Instead of just one attention layer, multi-head attention allows the model to focus on different parts of the sentence at once.
One head might focus on subjects, another on verbs, another on adjectives — making understanding richer and deeper.
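Extending the sketch above, multi-head attention just runs several small attention computations in parallel and concatenates their outputs (again with random toy numbers, not a real model):

```python
import numpy as np

np.random.seed(1)

def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

x = np.random.randn(4, 8)                               # 4 tokens, model dimension 8
# two heads, each projecting the 8-dim embeddings down to 4 dims
W_q = [np.random.randn(8, 4) for _ in range(2)]
W_k = [np.random.randn(8, 4) for _ in range(2)]
W_v = [np.random.randn(8, 4) for _ in range(2)]

heads = [attention(x @ wq, x @ wk, x @ wv)
         for wq, wk, wv in zip(W_q, W_k, W_v)]          # each head output: (4, 4)
out = np.concatenate(heads, axis=-1)                    # concatenated back to (4, 8)
print(out.shape)
```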
10)Softmax
Softmax is a function that turns numbers into probabilities. The final layer of most models uses it to decide what word to output next.
Example:
It might say:
“Cat”: 80%
“Dog”: 15%
“Fish”: 5%
It will pick the highest one: “Cat”.
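Here's softmax in a few lines of NumPy; the raw scores below are made up, but they land close to the 80/15/5 split in the example:

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - np.max(logits))   # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([2.5, 0.9, -0.2])       # raw scores for "Cat", "Dog", "Fish"
for word, p in zip(["Cat", "Dog", "Fish"], softmax(logits)):
    print(f"{word}: {p:.0%}")             # roughly Cat: 79%, Dog: 16%, Fish: 5%
```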
11)Temperature
Temperature controls randomness in text generation.
Low temp (e.g. 0.2) → safe and predictable
High temp (e.g. 0.9) → creative and wild!
It’s like controlling how adventurous the AI should be in answering.
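Reusing the softmax idea, temperature simply divides the raw scores before they become probabilities: a low value makes the top choice dominate, a high value spreads the probability around (scores are made up):

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    scaled = np.asarray(logits) / temperature   # low T sharpens, high T flattens
    e = np.exp(scaled - np.max(scaled))
    return e / e.sum()

logits = [2.5, 0.9, -0.2]                                 # "Cat", "Dog", "Fish"
print(softmax_with_temperature(logits, 0.2).round(2))     # ~[1.00 0.00 0.00] -> almost always "Cat"
print(softmax_with_temperature(logits, 0.9).round(2))     # ~[0.82 0.14 0.04] -> close to the normal split
print(softmax_with_temperature(logits, 2.0).round(2))     # ~[0.59 0.26 0.15] -> much more adventurous
```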
12)Knowledge Cutoff
LLMs aren’t connected to the live internet. They only “know” what they were trained on.
Example: ChatGPT’s knowledge cutoff is April 2023 — anything after that, it doesn’t know unless you feed it manually.
13)Tokenization
Tokenization is the process of breaking text into smaller pieces called tokens.
Example:
“The cat sat.” → ["The", "cat", "sat", "."]
Each token is later converted into a number using a vocabulary dictionary.
Challenge 1: Exploring tiktoken with GPT-4o
In our first challenge, we delved into OpenAI's tiktoken library, which provides tokenization compatible with models like GPT-4o.
Steps Undertaken:
Setting Up the Environment
```bash
python3 -m venv venv
source venv/bin/activate
pip install tiktoken
```
Creating requirements.txt
```txt
tiktoken
```
Developing tokenization.py
```python
import tiktoken

encoder = tiktoken.encoding_for_model('gpt-4o')
print("Vocab Size:", encoder.n_vocab)

text = "The cat sat on the mat"
tokens = encoder.encode(text)
print("Tokens:", tokens)

decoded = encoder.decode(tokens)
print("Decoded:", decoded)
```
Output:
```
Vocab Size: 200019
Tokens: [976, 9059, 10139, 402, 290, 2450]
Decoded: The cat sat on the mat
```
This exercise provided insights into how pre-trained models tokenize text and the importance of consistent encoding and decoding.
14)Vocab Size
Vocab size is the total number of unique tokens a model can recognize.
Example:
GPT-4o's tokenizer has a vocabulary of roughly 200,000 tokens (we saw 200019 in the challenge above).
Each token = a word, part of a word, or even punctuation.
A larger vocab size means text splits into fewer tokens, but it also makes the model's embedding and output layers bigger and more computationally expensive.
📝 Final Thoughts
Learning AI from scratch felt intimidating at first, but breaking down this jargon in simple terms made everything easier. I now understand how my sentence becomes a set of numbers, how it’s transformed, and how AI decides what to say next.