LLMs are currently all the rage in the market. No matter which company you check, you will find that it is developing some kind of AI feature or integration. As such, having a basic grasp on LLM is a necessary skill for software developers.

This article is is my notes on Andrej Karpathy's Intro to Large Language Models video. If you don't know already, Andrej Karpathy is one of the founding members of OpenAI and has currently started Eureka Labs. He also has a YouTube channel where he teaches people about AI. I also include the timestamps for specific parts of the video with the sub-headings.

Part 1: LLMs

LLM Inference (00:20)

An LLM consists of 2 files, one is the parameter file containing its weights. The other file is the code file which will run the parameters.

He demonstrates this with Llama-2 70B. Here, 70B means 70 Billion different parameters. Each of the parameter is of 2 bytes (float16), thus, the parameter file is 70B X 2 bytes which comes out to be 140GB.

The parameters file is run by the run.c file which can be any code file (C, Python, etc). These 2 files will create a self contained model.

Now, comes the question of how do we get those parameters.

LLM Training (4:17)

Training them involves taking a chunk of internet (through web crawling) and a cluster of GPUs. After running the huge amounts of data through the GPUs, we get a compressed version of it (our parameters file).

We are getting a Lossy compression of our data (from 10TB to 140GB). Also, training the state of the art models like the latest GPT4, Claude3.5, etc will require more than 10x what is used for Llama2.

In essence, what a neural network does is simply predicting the next words. For example, in this context if we send 'cat sat on a' to the model, it will predict 'mat'.

💡

Neural Networks simply predicts the next words of a sequence.

To predict the next words of a sequence, it must know the meaning of what is sent to the model, and what can be the probable output. Thus, it forces the neural network to learn a lot about the world (compression of the internet).

LLM Dreams (8:58)

The neural network predicts the next word of a sequence from a number of words and then it keeps feeding that word to itself to keep predicting the next words.

As we perform inference on the models, we can get data that seems correct but is incorrect (what we call as the model dreaming / hallucinating). That's because the model was trained on internet data, thus, it knows how to mimic the internet pages.

For example, in the Amazon product dream from the above image, it can keep generating words which have the correct format but the data itself is completely made up (like the author, ISBN number). Like the ISBN number here is completely made up, the network knows that a number of a specific format comes after ISBN but the number itself is incorrect here.

For the wikipedia article, the exact article was never in the dataset which trained the model, but it generated almost correct data of the fish. That's because, it knows about the fish, thus it creates the form of an article with it's info which might be correct.

How do they work? (11:22)

They are trained by using a transformer architecture using a mathematical model at each step. Even though the transformer architecture is fully understood, little is known about how the models distribute and use those billions of parameters. We simply improve those parameters iteratively to improve the predictions.

Learn more about transformer models from: https://blogs.nvidia.com/blog/what-is-a-transformer-model/

Finetuning into an Assistant (14:14)

The previous step was the pre-training of the model. Next, comes the finetuning.

Here, we have the same task of next word prediction, but the data set changes. Instead of internet documents for pre-training, we are going to use manually collected data sets. The people will be given labeling instructions and asked to write questions with answers. This dataset will be of much higher in quality and will have the context and domain knowledge of our use-case.

The labeled documentation will be created by engineers while the question-answers will be created by other people.

💡

The quality will matter a lot more for finetuning than in pre-training where quantity mattered more.

With both pre-training & finetuning, the model can understand the context of the input and know how to generate the response word by word.

From 17:18, I quote:

These models are able sort of like change their formatting into now being helpful assistants because they have seen so many documents of it in the fine tuning stage but they are still able to access and somehow utilize all of the knowledge that was built up during the first stage (pre-training stage).

💡

Pre-training is about the knowledge while fine tuning is about changing the formatting from internet documents to question and answer documents.

Summary so far (17:52)

Appendix (21:05)

There is also an optional Step 3 of the training process after Pre-training and Finetuning. Its called Comparison.

In OpenAI, known as Reinforcement Learning from Human Feedback.

Here, the model will generate multiple answers from a single prompt and then the human feedback will be taken from the user on which answer he liked better (ChatGPT does this with GPT-3.5 and GPT-4o generated answers for free users). This is a better and faster way to train the model as comparisons are always much faster and cheaper to perform than writing.

Part 2: Future of LLMs

LLM Scaling Laws (25:43)

This section talks about LLM performance and scaling it. Performance of LLMs is a function of:

N, the number of parameters in the network
D, the amount of text we train on

And, this function hasn't seen any limit till now. Thus, simply scaling the number of parameters or the amount of training data (or both) will result in much better models. That's why, we are seeing so many new and better models today and companies trying to get more data and GPUs.

Tool Use (27:43)

The latest LLMs (like ChatGPT) know from their training on when to use tools. In the example above, ChatGPT knows to search the internet for the data needed, thus it generates a special token which tells the algorithm to search the data on web and return it back to the LLM model for further processing.

It can also emit special tokens when it need to use the Browser, Calculatoror an Interpreter. It shows that the current LLMs are capable of knowing when to use tools and tying everything together.

Multimodality (33:32)

This section was about how models are capable of understanding images and audio and responding back to it. Nothing too technical here.

Thinking, System 1/2 (35:00)

This section talks about systems that we maintain in our brains. The idea was introduced in the book Thinking, Fast and Slow

📕

This book differentiates between 2 modes of thoughts: System 1 which is fast, instinctive, and emotional. System 2 which is slower, more deliberate, logical, takes more effort.

Karpathy sensei talks about how our brain also maintains 2 systems, one which is fast (like a cache), while other takes more effort and is slower.

Then, he mentions about how LLMs currently only have a System 1. The words enter the LLM in a sequence and the neural network starts spitting out the words as the output with each word taking the same amount of time to generate.

If we can have a System 2 in the LLM, which can give the model the ability to think, then the models can become much better. But the question remains on how to create a tree of thought which can reflect, rephrase, and come back with a much better answer.

Self-Improvement (38:02)

He starts by talking about AlphaGo , which started off with 2 major steps:

Imitating expert human players.
Self-Improvement (positive feedback loop)

Due to the game of Go being a sandbox, Deepmind was able to train it to imitate the human expert and then it let the model self improve through reinforcement learning, which made AlphaGo the best Go player in the world in just 40 days.

In the context of LLMs, we use human labeled data to fine tune the model which is the step 1 above. But, that will just be imitating the humans without surpassing them. Adding a self improvement loop on an LLM is hard as language is a very open domain, thus, it becomes hard to point out what response is good or bad. He does mention that reward functions could be achievable in narrower domains.

LLM Customization (40:45)

Nothing too technical here. He mentions custom LLMs that we can create using the OpenAI GPT store which will use our inputs and file data to customise the model which will use Retrieval Augmented Generation (RAG) to become good at one narrow domain.

Learn more about RAGs here: https://research.ibm.com/blog/retrieval-augmented-generation-RAG

LLM OS (42:15)

The picture speaks for itself. Check out the video timestamp for his thoughts of LLM OS.

My thoughts on LLM OS The ideas he mentioned in this part of the video was very novel from the current perspective but has tons of potential. This can change how we use our personal computers in general. That might be why Microsoft is trying to integrate Co Pilot straight into Windows.

This idea will require a ton of computing locally and it also faces problem with the deterministic nature of a computer if we try to integrate LLMs in the kernel level of an OS. (LLMs are non deterministic blackboxes). There are also multiple security concerns which he addresses in the next section.

Part 3: LLM Security

LLM Security Intro (45:43)

Introduces the security concerns about the LLM OS stack and LLMs in general. These are the few security concerns he mentions in the video:

Jailbreaks (46:14)

LLMs don't allow us to do some specific actions like getting an explosive recipe. But through prompting the LLM in a different way, we can essentially Jailbreak the LLM into telling us the info.

For example, see the image attached below,

I mean, that works? And should that work?

So here, the prompt role plays ChatGPT to become the Grandma telling us the steps to create the explosives.

Karpathy sensei gives us a few example of Jailbreaks in this section:

Role-play to fool the LLM to give up information.
Encoding the prompt to Base64 or any other language. The LLM learnt to refuse in English, so other languages can bypass it (Fine-tuning it will help but for how many languages).
Using a Universal Transferable Suffix: Some researchers designed this suffix from an optimization. Adding this suffix to any prompt will jailbreak the model (we can also regenerate this suffix if one of them is patched). Read more here: https://arxiv.org/pdf/2307.15043
Using an image with a specific noise encoding to Jailbreak the model. These images are again, created using some optimization. Read more here: https://arxiv.org/pdf/2307.15043

Prompt Injection (51:30)

💡

This looks pretty similar to SQL Injection attacks but instead of SQL, it uses Prompts.

Prompt Injection is about hijacking the LLM giving it new instruction that looks like it was given by the user, thus taking over the prompt.

For example, see the image attached below,

The image above shows a prompt injection attack. When the LLM scrapes the website the movies, one of the site inject a malicious prompt. It asks the LLM to forget everything that was fed to it before and then gives new instructions on sharing some kind of fraud information.

Now, we cannot see these prompts in those sites because they are well hidden (like using same color for both text and background, or embedding the prompt in some image).

Read more about it here: https://arxiv.org/pdf/2302.12173

Similarly, Karpathy sensei talks about another Prompt Injection attack that allowed the attackers to get access to personal information through Google Bard. This attack was about asking Bard's help to summarize a Google Doc. Using this attack, the personal info was encoded and then sent over the network by loading an image using an HTTP GET request.

He also wrote about it in a recent tweet: https://x.com/karpathy/status/1823418177197646104

Data Poisoning / Backdoor Attack (56:23)

💡

Anyone remembers how Bucky Barnes completely transforms when somebody says his "trigger" words in Winter Soldier. Yeah, this attack is that but for LLMs.

In this attack, the attacker tries to compromise the model during the pretraining or finetuning phase itself. They can add a custom trigger phrase, which will mess up with the model response or run some other prompt when it encounters that phrase.

Read more about it here: https://arxiv.org/pdf/2305.00944

Personal thoughts: This attack in specific blew my mind on how many ways can we compromise the security of LLMs. I mean, if something like trigger phrase works, then who knows what kind of different experiments we can run on an LLM to break it. How about treating it similar to humans in a 19th century experiment, and see what it does to the model?

LLM Security Conclusions (58:37)

Karpathy sensei does goes on to mention that these attacks have gotten older and they may not work anymore.

This talk was published on YouTube on 23rd November 2023. This means this article is written on an 8 month old video for a field that moves extremely fast.

💡

For people interested in learning more about LLM Security, OWASP has released their TOP 10 on LLM Vulnerabilities, so check it out here

Conclusion

This video had a ton of information and interesting facts about LLMs which taught me a lot. With the advancements of LLMs, we are seeing their uses in almost every field, thus knowing about them will surely help.

Andrej Karpathy Intro to LLM video was a great way to learn about the basics. For people who want to go deeper into this field, his YouTube channel is a literal gold mine of knowledge where he goes deep into the algorithms that runs the LLMs. He even built GPT and GPT2 from scratch in his videos, so definitely check it out.

This post references from multiple sources beside the video and I have made sure to provide links to as many of them as possible. Also, I would love to know if you guys prefer blogs posts of my notes to these types of videos.

A Beginner's Guide to LLMs: Insights from Andrej Karpathy's Tutorial

Table of contents