Decoding AI jargons with Chai

Varun Saxena
8 min read

🧠 AI Madness: Why Everyone’s Talking About ChatGPT and Job Fears

This has become a major buzzword in recent times. Ever since ChatGPT came into the picture, everyone seems to have lost their minds, worried that AI might take over their jobs. But what is this ChatGPT, what exactly is this term "A.I.", and why are people so worried? Let's find out.

Artificial Intelligence is a human-driven effort to make machines conscious and intelligent like us.

Over the past hundred years, many mathematicians and computer scientists have developed statistical techniques and advanced algorithms in an attempt to mimic human intelligence — but even today, we’re still nowhere close to fully replicating it.

But there have been several advancements in this field that make it seem more capable and more mysterious than it really is, which is why people worry about their jobs.


🔍 Let’s Get Nerdy: Breaking Down the Brain of AI

You can think of AI as one big, complex problem. To make it easier to solve, we break it down into smaller chunks — like Machine Learning and Deep Learning — which together make up what we call AI.

Machine Learning is a sub-field of A.I. where we use statistical techniques to make computers able to learn on their own without being explicitly programmed.

Machine learning itself is a vast subject, and I won't dive too deep into it right now. But to give you a rough idea — it learns from existing data and its outcomes, gradually adjusting itself to minimize errors over time.
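To make that concrete, here is a toy sketch in Python, with made-up numbers, of a model learning from existing data by repeatedly nudging its parameters to reduce its error. This is a deliberately simplified illustration of the idea, not how production ML is done:

```python
import numpy as np

# Toy data: the "existing data and its outcomes" -- y roughly follows y = 2x + 1
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1])

w, b = 0.0, 0.0          # the model starts out knowing nothing
learning_rate = 0.01

for step in range(2000):
    y_pred = w * x + b          # current guess
    error = y_pred - y          # how wrong the guess is
    # nudge w and b in the direction that reduces the squared error
    w -= learning_rate * (2 * error * x).mean()
    b -= learning_rate * (2 * error).mean()

print(f"learned w={w:.2f}, b={b:.2f}")   # ends up close to w=2, b=1
```

After enough iterations the parameters settle near the values that best explain the data, which is exactly the "adjusting itself to minimize errors over time" described above.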

Most problems in AI were already being tackled using Machine Learning, so you might wonder — why do we even need Deep Learning? Well, ML works great for relatively simple problems using algorithms like linear and logistic regression and many more. But when it comes to complex, real-world challenges where ML starts to fall short, Deep Learning steps in. Interestingly, those basic ML algorithms actually become the foundational building blocks for Deep Learning models.

Deep learning is a technique that partially mimics the workings of neurons inside the human brain. It is used when classical machine learning falls short on complex problems like image classification, speech recognition, and so on.
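As a rough illustration (not how real networks are built or trained), here is what a single artificial "neuron", the basic building block stacked in deep learning models, looks like in code:

```python
import numpy as np

def neuron(inputs, weights, bias):
    # A single artificial "neuron": weighted sum of inputs, then a non-linearity.
    # This loosely mirrors a biological neuron firing once its inputs are strong enough.
    z = np.dot(inputs, weights) + bias
    return 1 / (1 + np.exp(-z))   # sigmoid activation squashes the output to (0, 1)

x = np.array([0.5, -1.2, 3.0])   # some input signal (e.g. pixel values, features)
w = np.array([0.8, 0.1, -0.4])   # learned weights
print(neuron(x, w, bias=0.2))
```

A deep network is essentially many layers of such units, with the weights learned from data rather than set by hand.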


🧠 Language, Tokens & Imagination: The Brains Behind ChatGPT

After getting a bit of an overview of basic A.I. concepts, we are ready to learn about the advanced technology everyone is so afraid of: ChatGPT.

ChatGPT was first released to the public in 2022, and as soon as it launched, everyone was amazed at how well it made day-to-day life easier. But how is it different from all the concepts we discussed above?

Well, GPT stands for "Generative Pre-Trained Transformer", the kind of trained model used to build this chat application. All of this falls under one of the niches of deep learning called NLP (Natural Language Processing).

Natural Language Processing (NLP) is a branch of AI that mainly deals with text data. Using machine learning algorithms and deep neural networks, NLP helps machines understand human language and predict what might come next in a sentence.

For example, given some context about a situation or problem, NLP tries to find patterns in the text and predict the most probable next word.
Suppose the sentence "How are you?" is given as a prompt to ChatGPT — the most likely response could be "I'm doing well."
That prediction comes from learning patterns in massive amounts of text data.

Now, behind the scenes, this involves a lot of complex math and probability — but at a high level, that’s the essence of what NLP does.
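Here is a deliberately tiny sketch of that idea: a toy "model" that counts which word follows which in a small made-up corpus, then predicts the most probable next word. Real models are enormously more sophisticated, but the underlying goal of learning patterns and predicting the next word is the same:

```python
from collections import Counter, defaultdict

# Tiny made-up corpus for illustration only
corpus = "how are you ? i am doing well . how are you doing today ?".split()

# Count which word tends to follow which
next_word_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    next_word_counts[current][nxt] += 1

def predict_next(word):
    # Pick the word that most often followed this one in the training text
    return next_word_counts[word].most_common(1)[0][0]

print(predict_next("how"))   # -> "are", because "are" most often followed "how"
```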

So how is GPT different from just NLP?
Well, think of GPT as an evolution of NLP. While NLP focuses on understanding and predicting language, GPT (Generative Pre-trained Transformer) takes things further. It combines:

  • Generative modeling (so it can produce new text, not just analyse it),

  • The Transformer architecture (a powerful deep learning model), and

  • Pre-training on huge datasets (to learn grammar, context, facts, and reasoning).

This combo allows GPT to not just predict the next word — but to generate full sentences, paragraphs, or even stories, all while staying coherent to the given context.


🪙 Token by Token: How Language is Fed to AI Models

Any prompt given to GPT is first converted into tokens.

Example:
“Hello my name is Varun Saxena”
→ Tokenized version: “Hello”, “my”, “name”, “is”, “varun”, “saxena”
Each word in the sentence is split into individual tokens.

Now, I know some of you might wonder why we need to split a sentence at all, and how that helps. Well, machine and deep learning models understand data in the form of vectors, which they use to predict the output, so converting sentences into words (tokens) is a convenient first step that later helps us convert them into vectors.
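To make this concrete, here is a small sketch: first the simple word-splitting from the example above, and then (assuming the tiktoken library is installed) the sub-word token IDs that real GPT-style tokenizers actually produce:

```python
# A simple whitespace tokenizer, as in the example above:
sentence = "Hello my name is Varun Saxena"
tokens = sentence.lower().split()
print(tokens)   # ['hello', 'my', 'name', 'is', 'varun', 'saxena']

# Real GPT models actually split text into sub-word pieces and map each
# piece to an integer ID. tiktoken (OpenAI's tokenizer library) shows this:
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
ids = enc.encode(sentence)
print(ids)                 # a list of integer token IDs
print(enc.decode(ids))     # decoding the IDs gives the original text back
```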

After tokenisation, the words are converted into vectors using classical machine learning techniques like one-hot encoding, bag of words, n-grams, or TF-IDF. Most of these techniques aren't used in industry-level projects, because they fail to capture the meaning of the original sentence. The whole point of vectorising sentences is to be able to feed them to the model while keeping their original meaning intact.
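For a quick feel of these classical techniques, here is a sketch using scikit-learn's vectorizers (purely for illustration). Notice that the resulting vectors are just counts and weights; nothing in them encodes what the sentences actually mean:

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["I love chai", "I love coffee", "chai is better than coffee"]

# Bag of words: each sentence becomes a row of raw word counts
bow = CountVectorizer()
print(bow.fit_transform(docs).toarray())
print(bow.get_feature_names_out())

# TF-IDF: the same counts, re-weighted by how rare each word is across the corpus
tfidf = TfidfVectorizer()
print(tfidf.fit_transform(docs).toarray())
```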

This is where Transformers come in.

Before Transformers, models struggled with long sentences or remembering context. But Transformers made it possible for AI to understand the meaning of words in context, no matter where they appear in a sentence — and that’s HUGE.

Imagine you’re reading a sentence like:

“I went to the bank to sit by the river.”

The word “bank” could mean money bank or river bank — and only by looking at the whole sentence can you tell which one.
Traditional models struggled here.

But a Transformer looks at all the words in the sentence at once — not just the ones nearby — and tries to understand how they all relate to each other.

This magic is done using two concepts: self-attention and vector embeddings.


⚙️ Inside the Transformer: How GPT Understands Context

Now comes the most crucial, and almost the final, part of understanding how GPT works. Inside the Transformer, a whole pipeline is at work: all the text is first preprocessed, cleaned, tokenized, and finally vectorized into a meaningful mathematical representation.

Vector Embeddings

Here comes one of the most important steps: vector embedding.

Before we get into this concept, let's first understand what a vector is. A vector is simply a way to represent numbers as a point in a coordinate space. Machine learning models train faster and more easily when given input in the form of vectors, and vectorization also reduces the time complexity of training.

An embedding is a different thing: it is any numerical representation of data that captures its relevant qualities in a way that ML algorithms can process. Why is this important? We discussed various techniques above that convert a corpus into tokens and vectors, but none of them were able to grasp the meaning of the sentence, which is a huge problem! If the model doesn't get the context or the meaning behind the given statement or prompt, how will it be able to predict anything?

Embedding makes it possible to truly capture the essence of a sentence and then convert it into a vectorized format.

The core logic of vector embeddings is that n-dimensional embeddings of similar data points should be grouped closely together in n-dimensional space.
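A small sketch of that idea, using the open-source sentence-transformers library and the all-MiniLM-L6-v2 model (an arbitrary choice for illustration; it is not the embedding model GPT itself uses):

```python
from sentence_transformers import SentenceTransformer, util

# Turn sentences into embedding vectors; similar meanings end up as nearby vectors
model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "I deposited money at the bank",
    "I sat on the river bank",
    "I opened a savings account",
]
vectors = model.encode(sentences)
print(vectors.shape)                          # (3, 384): each sentence -> a 384-dim vector

print(util.cos_sim(vectors[0], vectors[2]))   # high: both sentences are about money
print(util.cos_sim(vectors[0], vectors[1]))   # lower: different meanings of "bank"
```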

The best measure of similarity for a specific situation depends largely on the nature of the data and what the comparisons are being used for.

  • Euclidean distance measures the straight-line distance between two vectors. The distance between two n-dimensional vectors a and b is calculated by first adding the squares of the differences between each of their corresponding components, (a₁−b₁)² + (a₂−b₂)² + ... + (aₙ−bₙ)², and then taking the square root of that sum. Because Euclidean distance is sensitive to magnitude, it's useful for data that reflects things like size or counts. Values range from 0 to ∞.
  • Cosine similarity is a normalized measure of the cosine of the angle between two vectors. It ranges from -1 to 1, where 1 represents vectors pointing in the same direction, 0 represents orthogonal (unrelated) vectors, and -1 represents fully opposite vectors. Cosine similarity is used widely in NLP tasks.
  • Dot product is, algebraically speaking, the sum of the products of the corresponding components of the two vectors. All three measures appear in the sketch below.
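Here is a minimal sketch of all three measures with NumPy, using two made-up vectors:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])

euclidean = np.sqrt(np.sum((a - b) ** 2))                 # straight-line distance
dot = np.dot(a, b)                                        # sum of element-wise products
cosine_sim = dot / (np.linalg.norm(a) * np.linalg.norm(b))

print(euclidean)    # ~3.74: the vectors sit some distance apart in space
print(dot)          # 28.0
print(cosine_sim)   # 1.0: they point in exactly the same direction
```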

Picture the embeddings of meaningful words plotted in space: words with similar meanings end up clustered close together.

🔍 What is Self-Attention?

It is a technique that helps the model focus on specific words without losing context. It helps the model capture relationships between words, no matter how far apart they are.

Let’s take this sentence:

“The animal didn’t cross the road because it was too tired.”

Here, the word “it” refers to “the animal”, not “the road”.
Self-attention helps the model understand that relationship, even though those words are far apart.
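To tie it together, here is a stripped-down sketch of self-attention in NumPy. It omits the learned query, key, and value projections that a real Transformer uses, so treat it as an illustration of the idea rather than the actual GPT implementation:

```python
import numpy as np

def self_attention(X):
    # X: one row per word, one column per embedding dimension.
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)   # how strongly each word relates to every other word
    # softmax over each row turns the scores into attention weights
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
    # each word's new representation is a weighted blend of all the words it attends to
    return weights @ X

# 4 "words", each represented by a 3-dimensional toy embedding
X = np.random.rand(4, 3)
out = self_attention(X)
print(out.shape)   # (4, 3): same shape, but every row now carries context from the others
```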

🧾 Conclusion: Language Meets Intelligence

In this article, we explored how ChatGPT, powered by GPT (Generative Pre-trained Transformer), is more than just a text generator — it's a product of some of the most advanced deep learning techniques in Natural Language Processing.

We began with the basics of NLP, moved into what makes GPT unique, and then dived into the self-attention mechanism — the heart of the Transformer architecture. This mechanism enables GPT to deeply understand the context of a conversation and generate intelligent, coherent responses.

What makes GPT revolutionary is its ability to generate, not just understand — making it useful in applications like coding assistants, creative writing, education, and more.

As you continue your AI journey, remember:

The magic lies not just in the model, but in the data, architecture, and the deep mathematics that powers it all.
