Easy Understanding of GPT: The Brain Behind the AI Revolution

Dip Chakraborty

Generative AI isn’t just a passing trend anymore — it’s becoming a part of everyday life. Those who know how to use it effectively are finding themselves more productive, more creative, and sometimes even more emotionally supported. People turn to AI as a work assistant, a personal coach, a late-night conversation partner, or even a friend during tough times.

While some vaguely assume it mimics the human brain and has emotions, others see it as nothing more than an advanced FAQ tool. In this article, I’ll break things down as simply as possible. The theory and implementation behind GPT are very complex, but I’ll focus on the core ideas so you can understand how it works.

GPT (Generative Pre-trained Transformer) is a type of large language model (LLM) built on a technique called the attention mechanism, introduced by Google researchers in their 2017 paper, "Attention Is All You Need." Google laid the foundation, but OpenAI developed the first GPT model, GPT-1, in 2018 (yeah, I know, ChatGPT wasn’t released until 2022, with GPT-3.5).

Intelligence comes from learning, and learning requires a lot of knowledge. Machines, too, need to acquire a lot of knowledge by learning in order to achieve artificial intelligence.

So, how do these models learn?

It learns in two main steps: pre-training and fine-tuning. During pre-training, it learns grammar, facts, reasoning abilities, and nuanced language patterns from massive amounts of text by repeatedly trying to predict the next word in a sentence, like completing "The cat sat on the ___", adjusting its billions of internal parameters (numerical weights) to get better at this task. Then, when you give it a prompt, it breaks the input into tokens, processes them all at once, and uses what it learned to generate a relevant and coherent response one token at a time, each time picking what it thinks is the most likely next word.
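To see this next-word game in action, here’s a minimal sketch using the small open-source GPT-2 model through the Hugging Face transformers library (my choice for illustration; any GPT-style model works the same way in principle):

```python
# Minimal next-token prediction sketch using the open-source GPT-2 model
# via Hugging Face (pip install transformers torch).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# The model's whole job: given the text so far, score every possible next token.
inputs = tokenizer("The cat sat on the", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits        # shape: (1, num_tokens, vocab_size)

next_token_id = int(logits[0, -1].argmax())   # highest-scoring next token
print(tokenizer.decode(next_token_id))        # likely something like " floor"
```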

What are those token things, man?

Tokens are small pieces of text, sometimes full words, parts of words, or even punctuation, depending on how the model breaks them down.
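Here’s what that looks like in practice, again using the GPT-2 tokenizer from Hugging Face transformers as an example (other models split text differently, but the idea is the same):

```python
# A quick look at how text gets split into tokens.
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

print(tokenizer.tokenize("Tokenization isn't magic!"))
# Full words, word pieces, and punctuation each become separate tokens,
# e.g. something like: ['Token', 'ization', 'Ġisn', "'t", 'Ġmagic', '!']
# (the Ġ marks a leading space in GPT-2's vocabulary)
```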

Well, what are these parameters?

A parameter is, simply put, a tunable number inside the model; a kind of tuning knob.

Take, for example, a chef who has read and practiced thousands of recipes. Over time, they develop a sense of how much salt, spice, sweetness, or cooking time works best in different dishes. A parameter is like the chef’s taste preference; it varies from one chef to another, just like with humans. That’s why each AI is slightly different, even though they’re built on the same technology.

Let’s get back to the chef. When the chef gets a request for a new kind of dish, they obviously don’t follow a fixed recipe every time. They adjust the taste based on their experience and what feels right in the moment: the same dish, but maybe a little more garlic here, a little less salt there, depending on what you asked for. GPT works the same way when it responds.

Its billions of parameters are like this chef’s internal instincts: each parameter is a tiny dial inside the model that adjusts how much weight it gives to certain patterns in language. When you give GPT a prompt, it draws on its taste memory (pre-training) and uses its parameters to decide what sounds most appropriate at each step. That’s why even two different AI models built on similar technology can respond differently: they’ve been trained slightly differently, and their tastes (parameters) are not the same.
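Here’s a toy illustration of that idea (not the real architecture; all the numbers are made up): two "chefs" score the same candidate words, but their slightly different weights lead them to different preferences:

```python
# Toy sketch: the same candidates, slightly different weights,
# different preferences. All numbers here are invented.
import numpy as np

def softmax(scores):
    e = np.exp(scores - scores.max())
    return e / e.sum()

candidates = ["mat", "roof", "moon"]
features = np.array([0.9, 0.4, 0.1])    # how well each word fits the context

chef_a = features * 2.0                 # one model's weight = 2.0
chef_b = features * 3.5                 # another model's weight = 3.5

for name, scores in [("Chef A", chef_a), ("Chef B", chef_b)]:
    probs = softmax(scores)
    print(name, dict(zip(candidates, probs.round(2))))
# Same context, same candidates, but different weights give different
# probabilities, which is why similar models can answer differently.
```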

But wait, how does GPT understand the context of a query with that level of accuracy?

This is where the attention mechanism comes into play. It helps GPT figure out which words in your input are more important than others by assigning them more weight when generating a response. It doesn’t just read left to right. It reads the whole sentence together and evaluates how each word relates to the others.

For example:

I went to the bank to deposit money.

The key word here is bank. But it can mean a riverbank or a financial institution. How does GPT know which one you meant?

It looks at the surrounding words, like deposit and money, and gives them more attention. So it concludes: "Ah, this person is talking about a financial bank." This is what we call understanding context. Where humans grasp context intuitively, GPT does it statistically.
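To make that concrete, here’s a toy version of dot-product attention with tiny made-up vectors (real models use learned, high-dimensional vectors plus extra scaling and separate query/key/value projections, so this is only a sketch of the core idea):

```python
# Toy dot-product attention: which words should "bank" pay attention to?
import numpy as np

words = ["I", "went", "to", "the", "bank", "to", "deposit", "money"]
# Made-up 2D embeddings: axis 0 is roughly "finance-ness", axis 1 is "filler"
vecs = np.array([
    [0.1, 1.0], [0.2, 0.9], [0.0, 1.0], [0.0, 1.0],
    [1.0, 0.2], [0.0, 1.0], [0.9, 0.1], [1.0, 0.1],
])

query = vecs[words.index("bank")]       # "bank" asks: who is relevant to me?
scores = vecs @ query                   # dot product with every word
weights = np.exp(scores) / np.exp(scores).sum()  # softmax -> attention weights

for word, weight in sorted(zip(words, weights), key=lambda x: -x[1]):
    print(f"{word:>8}: {weight:.2f}")
# After "bank" itself, "money" and "deposit" get the largest weights,
# steering "bank" toward the financial meaning, not the riverbank.
```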

But how does GPT keep up a conversation like a human? Does it have memory?

Nope. GPT doesn’t have memory the way humans do. It doesn’t remember past conversations the way a friend might. The core GPT models have no built-in memory feature at all; memory is something the platforms layer on top with their own techniques, so it varies from product to product.

But the generic technique most platforms use is simple: every time you say something in a conversation, the model re-reads the entire conversation history as plain text. It processes all the words so far and responds based on that, but there’s a limit to how much GPT can see at once.

For example, imagine you’re chatting with a friend over text and you’ve both forgotten where the conversation started, so you scroll up to see what was said earlier. GPT does this too: it keeps the full conversation in view, but only up to a certain length, and replies based on that. That window of memory is called the context window. Each platform has a different context window size, and they keep growing.
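A rough sketch of how a platform might keep a conversation inside that window (the limit and the word-based "token" counting here are simplifications I made up):

```python
# Sketch of context-window truncation: keep appending messages, but only
# send the most recent ones that fit. Token counting is faked with words.
CONTEXT_WINDOW = 50   # made-up limit; real models allow thousands of tokens

def build_prompt(history, limit=CONTEXT_WINDOW):
    kept, used = [], 0
    for msg in reversed(history):        # walk backwards from the newest
        cost = len(msg.split())          # crude stand-in for token count
        if used + cost > limit:
            break                        # older messages fall off the edge
        kept.append(msg)
        used += cost
    return "\n".join(reversed(kept))     # oldest-first, newest still last

history = ["You: What's the capital of Italy?", "GPT: Rome."]
# ...many turns later, the earliest lines no longer fit in the prompt:
print(build_prompt(history))
```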

If the conversation gets too long or jumps between too many topics, GPT can get confused or forget earlier parts, because it gives the most attention to whichever topic is dominant or most recent.

For example, let’s have a dummy conversation:

You: What’s the capital of Italy?

GPT: Rome.

You: Can you suggest tourist spots there?

GPT: (Gives tourist spots in Rome)

You: Is Indian Dosa great?

GPT: Sure! It’s delicious.

You: I wanna try Hyderabadi Biryani.

GPT: Okay, here’s how to get to Hyderabad…

(Now you’ve shifted topics multiple times — Italy → Dosa → Biryani)

You: Umm… I was thinking about Rome, you know. Tell me the rest.

GPT: Rome is a great city… (But it completely forgets you already asked for tourist spots earlier!)

This happens because the latest parts of the conversation get more attention. GPT’s not confused — it just works with the latest dominant topics in the context window.

If the conversation gets too long or jumps around too much, older details fall outside the window, and GPT can’t see them anymore unless you bring them up again.

Wait a minute… if GPT doesn’t have any memory, then why does it feel so damn personal?

That’s the cool part. Even without long-term memory, GPT can feel personal. It’s not GPT alone; it’s smart tuning and prompt engineering behind the scenes.

When you casually drop personal facts in the chat (like “I love red,” “I’m a pharmacist,” or “I get sad at night”), the system picks these up within the current session and tries to tailor replies accordingly.

Imagine it like this:

GPT writes down a few sticky notes about you during the conversation:

- Likes red

- Works in pharma

- Feels lonely at night

Then it uses those notes to respond in a way that feels personal, even though it’s not storing them permanently.
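A minimal sketch of that sticky-note idea (real systems are far more sophisticated; the patterns and note format here are invented for illustration):

```python
# Scan the user's messages for simple personal facts and prepend them
# to the prompt. Patterns and note templates are purely illustrative.
import re

NOTE_PATTERNS = [
    (r"i love (\w+)", "Likes {}"),
    (r"i'?m an? ([\w ]+)", "Works as {}"),
    (r"i get sad at night", "Feels lonely at night"),
]

def take_notes(message, notes):
    for pattern, template in NOTE_PATTERNS:
        match = re.search(pattern, message.lower())
        if match:
            notes.add(template.format(*match.groups()))

notes = set()
for msg in ["I love red", "I'm a pharmacist", "I get sad at night"]:
    take_notes(msg, notes)

prompt = "Notes about the user: " + "; ".join(sorted(notes)) + "\nUser: How was my day?"
print(prompt)   # the model sees the notes alongside the new message
```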

However, some products like ChatGPT can actually remember facts across sessions. For example:

- Your name

- Your profession

- Your goals or preferences

These are stored in a kind of memory profile of you, so GPT can recall them the next time.
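Under the hood, that kind of persistent memory can be as simple as saving those notes to disk and loading them at the start of the next session. A toy sketch (the file name and fields are invented):

```python
# Persist the "sticky notes" between sessions as a small JSON file.
import json
from pathlib import Path

PROFILE = Path("user_profile.json")   # invented file name

def save_profile(facts):
    PROFILE.write_text(json.dumps(facts, indent=2))

def load_profile():
    return json.loads(PROFILE.read_text()) if PROFILE.exists() else {}

save_profile({"name": "Alex", "profession": "pharmacist", "goal": "learn GPT"})
profile = load_profile()              # the next session starts here
print("Remembered across sessions:", profile)
```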

Also, on most platforms, experts can fine-tune GPT on their own data; think of it like training a new employee with your company policy handbook (see the sketch after this list). That means GPT can:

- Speak in a specific tone (like your company’s voice)

- Know internal processes (like organizational work)

- Stay consistent for specific users (like customer support)
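Fine-tuning data often looks like a file of example conversations. Here’s a sketch in the common chat-style JSONL layout (the exact format varies by platform, and the company content here is invented):

```python
# Write chat-style fine-tuning examples to a JSONL file.
import json

examples = [
    {"messages": [
        {"role": "system", "content": "You are Acme Corp's support assistant."},
        {"role": "user", "content": "How do I reset my password?"},
        {"role": "assistant", "content": "Go to Settings > Security and choose 'Reset password'. Per Acme policy, the link expires in 15 minutes."},
    ]},
    # ...hundreds or thousands more examples teach tone and internal process
]

with open("train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```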

So you can see, GPT isn’t just clever software; it has to be large and powerful to maintain that accuracy. Roughly speaking, more parameters means more capacity to capture language patterns, and today’s production models already run with billions, or even trillions, of parameters to serve you.

Training and tuning them means running massive math operations on enormous amounts of text data, which requires thousands of high-end GPUs working in parallel, often for weeks or months. That’s why it’s so expensive to build, and why companies charge for access to the most powerful versions.

So next time you chat with a GPT-style AI, remember it’s not magic, it doesn’t understand emotion, and it’s not conscious. You can think of it as an insanely good pattern matcher, trained on more text than any of us could read in two lifetimes. It doesn’t know you. But with the right words, it can still feel like it does.

And maybe that is the most human part of all!
