Generative AI models like GPT and LLaMA are great at producing human-like text, but they often hallucinate: they give confident answers without checking them against real, current data. That’s where Retrieval-Augmented Generation (RAG) comes in.

RAG combines retrieval systems (search-like) with generative models to ground AI’s answers in real, external knowledge.

Before generating a response, it retrieves relevant information from an external knowledge base (e.g., documents, PDFs, databases, or the web).

This ensures answers are:

More accurate
More precise
Up-to-date (no longer limited by training date cut-off)
Customizable (can use domain-specific knowledge, like insurance policies or medical documents)

RAG is used because:

LLMs don’t know real-time or recent data.
Storing all knowledge inside a model would be too costly and inefficient.
Many industries (finance, law, healthcare) need reliable, explainable, and source-backed answers.

RAG has two key components:

Retriever → Finds the most relevant documents from a knowledge base.
Generator → Uses the LLM to produce a response, conditioned on the retrieved documents.
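To make the split between the two components concrete, here is a minimal sketch in plain Python. It rests on several assumptions that are not part of any real library: the retriever ranks documents by simple word overlap (a stand-in for real semantic search), the generator step only builds the prompt that would be sent to an LLM, and the names `retrieve`, `build_prompt`, and the sample knowledge base are hypothetical.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, knowledge_base: list[str], top_k: int = 2) -> list[str]:
    """Retriever: rank documents by how many query words they share."""
    query_words = tokenize(query)
    ranked = sorted(
        knowledge_base,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Generator input: the question conditioned on the retrieved documents."""
    context = "\n".join(f"- {doc}" for doc in documents)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

# Hypothetical knowledge base, invented for illustration.
knowledge_base = [
    "The travel insurance policy covers lost luggage up to 500 euros.",
    "Claims must be filed within 30 days of the incident.",
    "The office cafeteria is closed on public holidays.",
]

query = "How much does the policy cover for lost luggage?"
top_docs = retrieve(query, knowledge_base, top_k=1)
prompt = build_prompt(query, top_docs)
print(prompt)  # this string is what would be passed to the LLM
```

In a real system the word-overlap scoring would be replaced by vector search, and the prompt would be passed to whichever LLM you use.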

Indexing is the process of organizing documents so they can be searched efficiently.

In RAG, indexing typically means:

Splitting documents into smaller pieces (chunks)
Converting those chunks into vector embeddings (numerical representations of meaning)
Storing them in a vector database (like Pinecone, Weaviate, or FAISS)

When a query comes in, it is converted into a vector the same way, and the retriever quickly finds the stored vectors (chunks) closest to it.
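The index-then-query flow can be sketched roughly as follows. This is an illustration under stated assumptions, not how Pinecone, Weaviate, or FAISS work internally: `embed` is a toy word-count vector standing in for a real embedding model, and `InMemoryIndex` is a hypothetical in-memory stand-in for a vector database.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class InMemoryIndex:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self.entries: list[tuple[Counter, str]] = []

    def add(self, chunk: str) -> None:
        # Indexing step: embed the chunk once and store vector and text together.
        self.entries.append((embed(chunk), chunk))

    def search(self, query: str, top_k: int = 1) -> list[str]:
        # Query step: embed the query the same way, then rank stored chunks.
        query_vec = embed(query)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine(query_vec, entry[0]),
            reverse=True,
        )
        return [chunk for _, chunk in ranked[:top_k]]

index = InMemoryIndex()
for chunk in [
    "Refunds are processed within five business days.",
    "Premium plans include priority support.",
    "Passwords must be changed every ninety days.",
]:
    index.add(chunk)

results = index.search("When are refunds processed?", top_k=1)
print(results)
```

A production vector database adds approximate nearest-neighbour search so that lookups stay fast over millions of vectors; the linear scan here is only meant to show the idea.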

LLMs don’t understand raw text the way humans do. Instead, they work with vectors — high-dimensional numerical representations that capture semantic meaning.

Vectorization makes semantic search possible, so the retriever can find conceptually similar chunks, not just keyword matches.
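The difference from keyword matching shows up in a small cosine-similarity example. The three-dimensional vectors below are made up by hand purely for illustration; a real embedding model would produce them, and they would have far more dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two dense vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up embeddings, chosen by hand; not the output of any real model.
embeddings = {
    "How do I insure my automobile?": [0.9, 0.1, 0.2],
    "Banana bread recipe": [0.1, 0.9, 0.1],
}

query_text = "car insurance"
query_vector = [0.8, 0.2, 0.3]  # also made up

scores = {
    text: cosine_similarity(query_vector, vector)
    for text, vector in embeddings.items()
}
best_match = max(scores, key=scores.get)
print(best_match)
```

The query shares no words with the automobile sentence, yet its vector points in nearly the same direction, so that sentence scores highest. A keyword search would have found nothing.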

Documents are often too long to feed directly into an LLM. So, we break them into smaller, manageable pieces called chunks.

If chunks are split at strict, fixed boundaries, important context can get lost where one chunk ends and the next begins.
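A small sketch shows the boundary problem and one common mitigation, letting neighbouring chunks share a few words. The function `chunk_words`, its word-based sizing, and the sample text are assumptions for illustration; real pipelines often size chunks by tokens or split on sentences instead.

```python
def chunk_words(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    """Split text into chunks of `chunk_size` words that share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks

text = "The policy covers water damage. It does not cover damage caused by floods."

strict = chunk_words(text, chunk_size=5)
overlapping = chunk_words(text, chunk_size=5, overlap=2)

for chunk in strict:
    print("strict:     ", chunk)
for chunk in overlapping:
    print("overlapping:", chunk)
```

With strict splitting, the second chunk reads "It does not cover damage" and the qualifier "caused by floods." lands in a separate chunk. With a two-word overlap, one chunk contains "does not cover damage caused", which keeps more of that sentence together.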

INTRO TO RAGs


Hrishith Savir
