RAG: Super Serum for LLMs


RAG (Retrieval-Augmented Generation) is a technique that fills a gap in LLMs' capabilities. Although LLMs are highly capable, they have a limitation: every model has a knowledge cutoff, the date up to which it was trained, so it has no knowledge of events that happened after that point. Additionally, LLMs are unaware of data that wasn't part of their training but matters for your unique requirements, such as your company's financial data stored in its database.
RAG bridges this gap by providing extra sources the LLM can cite. Let's see how RAG does this. It comprises two main pipelines.
Indexing Pipeline: External documents such as PDFs and webpages are first retrieved and broken down into smaller chunks. This 'chunking' is done because LLMs have a limited context window, meaning we can only feed a limited number of tokens to the model at once. The chunks are then converted into numerical representations called vector embeddings, which are stored in a vector database.
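Here is a minimal sketch of what that indexing pipeline might look like in Python. It assumes the sentence-transformers library for embeddings; the chunk size, model name, and the plain in-memory list standing in for a vector database are illustrative choices, not fixed requirements.

```python
# Indexing pipeline sketch: chunk a document, embed the chunks,
# and store the (chunk, vector) pairs.
from sentence_transformers import SentenceTransformer

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split a document into overlapping chunks that fit the context window."""
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

# Load an embedding model (any sentence-embedding model works here).
model = SentenceTransformer("all-MiniLM-L6-v2")

document = open("report.txt").read()   # e.g. text extracted from a PDF
chunks = chunk_text(document)

# Convert each chunk into a vector embedding.
embeddings = model.encode(chunks)      # shape: (num_chunks, dim)

# Stand-in for a vector database; in practice you would use
# FAISS, Chroma, Pinecone, etc.
vector_store = list(zip(chunks, embeddings))
```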
Retrieval-Generation Pipeline: This pipeline is triggered when a user asks a query. The query itself is also converted into a vector embedding, and a similarity search is run against the vector database. The search returns the documents most relevant to the query. These relevant documents are added to the system prompt before it is sent, along with the original query, to the LLM. The LLM then uses this external data as its primary source of information, giving a much more relevant output.
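Continuing the sketch above (and reusing its `model` and `vector_store`), the retrieval-generation side might look like this. `call_llm` is a hypothetical placeholder for whichever LLM API you use, and cosine similarity is one common choice of similarity measure among several.

```python
# Retrieval-generation pipeline sketch: embed the query, find the
# most similar chunks, and inject them into the system prompt.
import numpy as np

def retrieve(query: str, top_k: int = 3) -> list[str]:
    """Embed the query and return the top-k most similar chunks."""
    q = model.encode([query])[0]

    def cosine(v: np.ndarray) -> float:
        return float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))

    scored = sorted(vector_store, key=lambda pair: cosine(pair[1]), reverse=True)
    return [chunk for chunk, _ in scored[:top_k]]

query = "What was our Q3 revenue?"
context = "\n\n".join(retrieve(query))

# The retrieved chunks become the LLM's primary source of information.
system_prompt = f"Answer using only the context below.\n\nContext:\n{context}"

# answer = call_llm(system_prompt=system_prompt, user_message=query)  # hypothetical
```

Combining both pipelines gives us the full RAG flow: index your documents once, then retrieve and generate on every query.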
Benefits of RAG:
It builds more trust in the LLM's output, as answers are generated from the data you provided.
There is a lower chance of hallucination.
This process is far more efficient than retraining or even fine-tuning the model, as it takes significantly less time than either.
