RAG Demystified: Create your own Simple RAG

Vedant Swami
4 min read

Have you ever wanted to chat with your PDFs like they’re your personal assistants? Whether you're skimming cheat sheets, research papers, or documentation, this blog shows you how to build an intelligent chatbot that answers questions from any PDF using Retrieval-Augmented Generation (RAG).

This is your complete guide to setting it up on Windows, with LangChain, Qdrant, and Google Gemini, all in under 100 lines of Python!

🚀 What You’ll Build

A terminal-based chatbot where you can ask:

  • “What is React?”

  • “Summarize the useEffect hook.”

And get intelligent, accurate responses from the actual content of the PDF, thanks to RAG.

📚 What is RAG (Retrieval-Augmented Generation)?

RAG stands for:

  1. Retrieval – Search the most relevant chunks from your PDF using a vector database.

  2. Augmented – Inject those chunks into the prompt.

  3. Generation – Let a language model generate context-aware responses based on those chunks.

This architecture grounds an LLM like Gemini in your document’s actual content, which keeps answers accurate and sharply reduces hallucination.

🧠 The Two Main Phases

πŸ” Indexing Phase (One-Time)

  • Load PDF

  • Split it into manageable chunks

  • Embed chunks into vector format

  • Store them in Qdrant vector DB

💬 Retrieval Phase (Each Query)

  • Accept a question

  • Search Qdrant for similar chunks

  • Combine them with your query

  • Let Gemini generate a response
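
To make the flow concrete before we wire up the real stack, here is a tiny, dependency-free sketch of the two phases. It is a toy illustration only: word-overlap scoring stands in for real vector embeddings, and the LLM call is stubbed out.

import re

# Toy corpus: two "chunks", as if a PDF had already been split.
chunks = [
    "React is a JavaScript library for building user interfaces.",
    "useEffect lets you run side effects after a component renders.",
]

def tokens(text):
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query, k=1):
    # Retrieval: rank chunks by word overlap (a stand-in for vector search).
    q = tokens(query)
    return sorted(chunks, key=lambda c: len(q & tokens(c)), reverse=True)[:k]

def answer(query):
    # Augmentation: inject the retrieved chunks into the prompt.
    # Generation: a real LLM (Gemini, below) would complete this prompt.
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQ: {query}"

print(answer("What is React?"))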

🧱 Project Structure

pdf-chat-rag/
├── .venv/                     # Python virtual environment
├── docker-compose.yml         # Qdrant database config
├── requirements.txt           # Python dependencies
├── main.py                    # Core app logic
├── pdfs/
│   └── React CheatSheet.pdf   # Add the PDF you want to chat with
└── .env                       # Store your Google API key here

🔧 Step-by-Step Setup

✅ Step 1: Create Project & Activate Virtual Env

python -m venv .venv
.venv\Scripts\activate

(In PowerShell, you may need .venv\Scripts\Activate.ps1 instead.)

✅ Step 2: Install Dependencies

Create requirements.txt and add the dependencies. (main.py below imports langchain_qdrant and langchain_text_splitters, so those packages are listed here as well.)

langchain
langchain-community
langchain-google-genai
langchain-qdrant
langchain-text-splitters
qdrant-client
python-dotenv
pypdf

Run the following command to install all the dependencies:

pip install -r requirements.txt

✅ Step 3: Add Your API Key

Create a .env file:

GOOGLE_API_KEY=your_google_genai_api_key
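
To confirm the key is picked up before going further, a quick check like this works (run it from the project root):

from dotenv import load_dotenv
import os

load_dotenv()  # reads .env from the current directory
assert os.getenv("GOOGLE_API_KEY"), "GOOGLE_API_KEY missing from .env"
print("API key loaded.")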

✅ Step 4: Start Qdrant with Docker

Create a docker-compose.yml file with the Qdrant config:

services:
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"

Open a new terminal and start Qdrant:

docker-compose up
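
To verify Qdrant is up, you can ping it with the qdrant-client package installed in Step 2 (the collection list will simply be empty at this point):

from qdrant_client import QdrantClient

client = QdrantClient(url="http://localhost:6333")
print(client.get_collections())  # empty until we index the PDF

Recent Qdrant images also serve a small web UI at http://localhost:6333/dashboard.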

✅ Step 5: Add a PDF

Put your file in the pdfs/ folder:

pdfs/React CheatSheet.pdf

✅ Step 6: Code the App (main.py)

  1. Import the dependencies

     from pathlib import Path
     from dotenv import load_dotenv
     import os
    
     from langchain_community.document_loaders import PyPDFLoader
     from langchain_text_splitters import RecursiveCharacterTextSplitter
     from langchain_google_genai import GoogleGenerativeAIEmbeddings, ChatGoogleGenerativeAI
     from langchain_qdrant import QdrantVectorStore
     from langchain_core.prompts import PromptTemplate
    
     load_dotenv()
     GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
    
  2. Load The PDF & Split it into chunks

     pdf_path = Path("pdfs/React CheatSheet.pdf")
     loader = PyPDFLoader(file_path=pdf_path)
     docs = loader.load()
    
     text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
     split_docs = text_splitter.split_documents(docs)
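     # Optional sanity check before embedding (nothing here is required):
     # print(f"Loaded {len(docs)} pages -> split into {len(split_docs)} chunks")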
    
  3. Create Vector Embeddings

     embedder = GoogleGenerativeAIEmbeddings(
         model="models/text-embedding-004",
         google_api_key=GOOGLE_API_KEY,
     )
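     # Note: text-embedding-004 returns 768-dimensional vectors; the Qdrant
     # collection created in the next step is sized to match automatically.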
    
  4. Store It in the Database

     vector_store = QdrantVectorStore.from_documents(
         documents=split_docs,
         embedding=embedder,
         url="<http://localhost:6333>",
         collection_name="pdf_rag_chat"
     )
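     # Note: from_documents re-embeds and re-uploads the PDF on every run.
     # On later runs you could attach to the already-populated collection
     # instead (also provided by langchain_qdrant):
     # vector_store = QdrantVectorStore.from_existing_collection(
     #     embedding=embedder,
     #     collection_name="pdf_rag_chat",
     #     url="http://localhost:6333",
     # )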
    
  5. Set Up LLM

     llm = ChatGoogleGenerativeAI(
         model="gemini-2.0-flash",
         google_api_key=GOOGLE_API_KEY
     )
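     # Optional smoke test (assumes your GOOGLE_API_KEY is valid):
     # print(llm.invoke("Say hello").content)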
    
  6. Define the SYSTEM PROMPT

     SYSTEM_PROMPT = """
     You are a smart PDF assistant. Answer user queries using only the provided PDF excerpts.
    
     - For **summaries**, give a brief overview of key points.
     - For **specific questions**, extract and present relevant info directly.
     - For **explanations**, start with a simple overview, then add detail if needed.
     - If the info isn't in the excerpts, reply: "The PDF does not contain this information."
    
     Be clear, concise, and avoid unnecessary jargon. Structure your answers to match the user's intent.
     If the query is unclear, ask the user to clarify it.
     """
    
     prompt_template = PromptTemplate(
         input_variables=["query", "excerpts"],
         template=SYSTEM_PROMPT +
                  "\n\nUser Query: {query}\n\nRelevant PDF Excerpts:\n{excerpts}\n\nAssistant:"
     )
    
  7. Create an Interactive Chat Loop

     while True:
         query = input("Ask a question about the PDF (or type 'exit' to quit): ")
         if query.lower() in ["exit", "quit"]:
             print("Goodbye!")
             break
    
         docs = vector_store.similarity_search(query, k=3)
         context = "\\n\\n---\\n\\n".join([doc.page_content for doc in docs])
         full_prompt = prompt_template.format(query=query, excerpts=context)
    
         response = llm.invoke(full_prompt)
         print("\\nAssistant:", response.content, "\\n")
    
  8. Once everything is set up, run the following command

         python main.py
    

Once it's running, try asking questions related to your PDF.


🎉 Congrats 🎊

You’ve just built a full RAG pipeline that brings your static PDFs to life using cutting-edge AI:

✅ LangChain to glue everything together

✅ Qdrant for fast semantic search

✅ Gemini for smart responses

✅ All running locally on your Windows machine

This setup works for cheat sheets, textbooks, resumes, reports, and more.

🔗 Let’s Connect

Enjoyed this tutorial? Share it, fork it, and follow for more on LangChain, LLMs, and full-stack AI.

Happy Building! 🛠️

Here is the GitHub repository for the code: https://github.com/Vedu-07/Generative_AI
