My First Blog with Chai&Code

Aniket kindre

Hello Reader,

I have never ever written a blog before and I do not have any idea how to write one.

So, I have decided to write this blog in Hinglish or so, because that is what makes it different from others, and in the future, if someone reads it by any chance, they will understand this jargon easily. Or maybe it will be used to train the next model, adding to the pool of online human-mind data. Of course, ChatGPT (now "LLM models", after this course) can always write or think better on any topic in any tone, but let's keep giving this a human touch.

TBH, I am using Grammarly here, which is also part of the predictive-model family. Working in a marketing agency, I know that embedding keywords can rank this blog at the top anytime, but here we are not selling anything; we are creating a Human diary (seriously, we are at the stage where we have to say "Human" before anything that cannot be generated by AI). So let's begin…

The very first question to be answered is:

What is AI?

Artificial Intelligence (ofc) is nothing but huge lines of code written to train models on existing data. And a fun fact: we are at the peak of training these models, and some in the industry (Elon Musk among them) claim the pool of human-generated training data will be exhausted around 2026. That means no fresh data will be available to train models on, and what remains will mostly be synthetic data (data that was itself generated as output by these models).
Ref: https://www.theguardian.com/technology/2025/jan/09/elon-musk-data-ai-training-artificial-intelligence

Okay, this is what AI is! Now, AI can be used for various purposes, but here, as part of my assignment, we will start from our daily life and the most-used platform, ChatGPT: our go-to tool. {"Bhai, yeh code error solve nahi ho raha hai!" ("Bro, this code error is just not getting solved!")} {"Are, ChatGPT karke dekh le naaah!!!" ("Arre, just try it on ChatGPT na!")} So…

What is ChatGPT?

ChatGPT, as the name suggests, lets you chat with GPT in human-like language. It is an LLM (Large Language Model) developed by OpenAI. But then, what is GPT?

GPT stands for Generative Pretrained Transformer.

Let’s break this down word by word:

  1. Generative ( Basic English ) means to generate something. Take Google, for example: Google is not a generative tool; it is a search engine that indexes pages from all over the world. GPT, on the other hand, can generate new text based on your input. It’s like asking a human to write a paragraph for you, tailored exactly to your request.

  2. PreTrained ( Little Hard English ) means trained on data in advance. But but but, it cannot keep training as you go. Just imagine millions of people entering their data on a portal while millions of lines of code run in the background; it would be very hard for the machines to process such big data on the fly. You might have heard that merely using these models to generate output is already melting GPUs (specialised processors, far more parallel than CPUs), so training on new data and serving users simultaneously doesn’t work as of now. That is the reason these models are pretrained on existing data.
    Ghibli Trend and Melting of GPUs

    Now, how do we know up to what point in time these models were trained? Different platforms showcase this in different ways. For example, for the Gemini models, on https://aistudio.google.com/prompts/new_chat, when you select an older model and hover over it, you can see a tab like Knowledge cutoff, which reads Aug 2024 in the snapshot attached.

    That means the last time this model was trained was Aug 2024. Now again, why did they not update it or train it again on new data? They did train again, but they repackaged it under a new model name (just an investors’ thing). Nowadays we also have models that don’t need retraining for fresh information but are integrated with a search engine to fetch the latest data.

  3. Transformer 🤖🚗 ( You get the Point 😉)

    Yes, this is the heart of the GPT model and its core architecture. It’s not a regular word; it’s a big deal in AI. The transformer model was introduced in Google’s 2017 paper titled Attention Is All You Need. ( Nice name, btw, to grab the attention of the World ).

    In this paper (as usual, a hard-to-read one; complexity: Maths in English), we are introduced to how this architecture works using the following terms:

    a. Encoder

    b. Decoder

    c. Attention

    d. Position-wise Feed-Forward Networks

    e. Embeddings and Softmax

    f. Positional Encoding
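To make "Attention" (term c above) a little less abstract, here is a minimal NumPy sketch of the scaled dot-product attention formula from the paper, softmax(QKᵀ / √d_k)·V. Note: the token count, dimensions, and random values below are toy numbers picked just for illustration, not anything from a real model.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q @ K^T / sqrt(d_k)) @ V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # how much each token "attends" to every other
    weights = softmax(scores)          # each row becomes a probability distribution
    return weights @ V, weights        # weighted mix of the value vectors

# Toy example: 3 tokens, each represented by a 4-dimensional vector
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)       # (3, 4): one output vector per token
print(w.sum(axis=-1))  # each row of attention weights sums to 1
```

The key intuition: every token looks at every other token and mixes in information from the ones it finds most relevant, and that mixing is what "attention" means here.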

Also check out this video from Google Cloud: https://youtu.be/SZorAJ4I-sA?si=hUYPH1hViy9x1Y50
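And since "Positional Encoding" (term f) will come up again, here is a tiny sketch of the sinusoidal version described in the same paper: sine waves for even dimensions, cosine waves for odd ones, so the model can tell word order apart. The seq_len and d_model values are arbitrary toy choices for this demo.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
    # PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
    pos = np.arange(seq_len)[:, None]          # token positions 0..seq_len-1
    i = np.arange(d_model // 2)[None, :]       # dimension-pair indices
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)               # even dims get sine
    pe[:, 1::2] = np.cos(angles)               # odd dims get cosine
    return pe

pe = positional_encoding(seq_len=10, d_model=8)
print(pe.shape)  # (10, 8): one encoding vector per position
print(pe[0])     # position 0: all sine terms are 0, all cosine terms are 1
```

These vectors get added to the word embeddings, which is how a transformer knows "dog bites man" is not "man bites dog".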

Okay, now we’ll further decode these jargons in the next blog.
