RAG Demystified: Create your own Simple RAG


Have you ever wanted to chat with your PDFs like they're your personal assistants? Whether you're skimming cheat sheets, research papers, or documentation, this blog shows you how to build an intelligent chatbot that answers questions from any PDF using Retrieval-Augmented Generation (RAG).
This is your complete guide to setting it up on Windows, with LangChain, Qdrant, and Google Gemini – all in under 100 lines of Python!
🚀 What You'll Build
A terminal-based chatbot where you can ask:
What is React?
Summarize the useEffect hook.
And get intelligent, accurate responses drawn from the actual content of the PDF – thanks to RAG.
🔍 What is RAG (Retrieval-Augmented Generation)?
RAG stands for:
Retrieval – search for the most relevant chunks from your PDF using a vector database.
Augmented – inject those chunks into the prompt.
Generation – let a language model generate context-aware responses based on those chunks.
This architecture helps LLMs like Gemini give accurate answers without hallucinating.
🧠 The Two Main Phases
📥 Indexing Phase (One-Time)
Load PDF
Split it into manageable chunks
Embed chunks into vector format
Store them in Qdrant vector DB
💬 Retrieval Phase (Each Query)
Accept a question
Search Qdrant for similar chunks
Combine them with your query
Let Gemini generate a response (see the sketch below)
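Before wiring up real libraries, here's a toy, dependency-free sketch of that flow. The retrieve function is a crude word-overlap stand-in for vector search (not what we'll actually use); the real embedding, storage, and generation steps are built below.

# Toy illustration of the RAG flow -- no real embeddings or LLM yet;
# the genuine versions are wired up step by step below.

chunks = [
    "React is a JavaScript library for building user interfaces.",
    "useEffect lets you run side effects after a component renders.",
]

def tokens(text):
    # Crude tokenizer: lowercase and strip basic punctuation
    return set(text.lower().replace("?", " ").replace(".", " ").split())

def retrieve(query, chunks, k=1):
    # Stand-in for vector search: rank chunks by word overlap with the query
    return sorted(chunks, key=lambda c: len(tokens(query) & tokens(c)), reverse=True)[:k]

query = "What is React?"
context = "\n".join(retrieve(query, chunks))           # Retrieval
prompt = f"Context:\n{context}\n\nQuestion: {query}"   # Augmentation
print(prompt)  # Generation: this is the prompt a real LLM would answer from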
🧱 Project Structure
pdf-chat-rag/
├── .venv/                    # Python virtual environment
├── docker-compose.yml        # Qdrant database config
├── requirements.txt          # Python dependencies
├── main.py                   # Core app logic
├── pdfs/
│   └── React CheatSheet.pdf  # Add the PDF you want to chat with
└── .env                      # Store your Google API key here
🧭 Step-by-Step Setup
✅ Step 1: Create Project & Activate Virtual Env
python -m venv .venv
.venv\Scripts\activate
✅ Step 2: Install Dependencies
Create requirements.txt and add all the dependencies:
langchain
langchain-community
langchain-google-genai
langchain-qdrant
qdrant-client
python-dotenv
pypdf
Run the following command to install all dependencies:
pip install -r requirements.txt
✅ Step 3: Add Your API Key
Create a .env file:
GOOGLE_API_KEY=your_google_genai_api_key
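A missing or misnamed key is the most common setup mistake, so it can pay to fail fast. Here's a small optional check using python-dotenv (the same loading code appears in main.py later):

import os
from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env into the environment
if not os.getenv("GOOGLE_API_KEY"):
    raise RuntimeError("GOOGLE_API_KEY is missing -- check your .env file")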
✅ Step 4: Start Qdrant with Docker
Create a docker-compose.yml file and add the Qdrant config:
services:
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"
Open a new terminal and start Qdrant:
docker-compose up
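Before moving on, you can confirm Qdrant is actually reachable on port 6333 with a quick optional check using qdrant-client (already in requirements.txt):

from qdrant_client import QdrantClient

client = QdrantClient(url="http://localhost:6333")
print(client.get_collections())  # prints an empty collection list on a fresh instance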
✅ Step 5: Add a PDF
Put your file in the pdfs/ folder:
pdfs/React CheatSheet.pdf
✅ Step 6: Code the App – main.py
Import the dependencies
from pathlib import Path
import os

from dotenv import load_dotenv
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_google_genai import GoogleGenerativeAIEmbeddings, ChatGoogleGenerativeAI
from langchain_qdrant import QdrantVectorStore
from langchain_core.prompts import PromptTemplate

load_dotenv()
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
Load the PDF & Split It into Chunks
pdf_path = Path("pdfs/React CheatSheet.pdf")
loader = PyPDFLoader(file_path=pdf_path)
docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
split_docs = text_splitter.split_documents(docs)
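It's worth a quick sanity check on the split before embedding anything. You can add these two lines temporarily after the splitting code:

print(f"Loaded {len(docs)} pages, split into {len(split_docs)} chunks")
print(split_docs[0].page_content[:200])  # preview the first chunk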
Create Vector Embeddings
embedder = GoogleGenerativeAIEmbeddings(
    model="models/text-embedding-004",
    google_api_key=GOOGLE_API_KEY,
)
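To verify the embedder is wired up correctly before bulk-indexing, you can embed a single query; text-embedding-004 should return a 768-dimensional vector (a quick optional check):

sample = embedder.embed_query("What is React?")
print(len(sample))  # expect 768 for text-embedding-004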
Store the Chunks in the Database
vector_store = QdrantVectorStore.from_documents(
    documents=split_docs,
    embedding=embedder,
    url="http://localhost:6333",
    collection_name="pdf_rag_chat",
)
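One thing to note: from_documents re-embeds and re-uploads the whole PDF every time main.py runs. Once the collection exists, you could connect to it instead; here's a sketch using langchain-qdrant's from_existing_collection (guard it however fits your workflow):

# After the first run, reuse the existing collection instead of re-indexing
vector_store = QdrantVectorStore.from_existing_collection(
    embedding=embedder,
    url="http://localhost:6333",
    collection_name="pdf_rag_chat",
)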
Set Up LLM
llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",
    google_api_key=GOOGLE_API_KEY,
)
Define the SYSTEM PROMPT
SYSTEM_PROMPT = """You are a smart PDF assistant. Answer user queries using only the provided PDF excerpts.
- For **summaries**, give a brief overview of key points.
- For **specific questions**, extract and present relevant info directly.
- For **explanations**, start with a simple overview, then add detail if needed.
- If the info isn't in the excerpts, reply: "The PDF does not contain this information."
Be clear, concise, and avoid unnecessary jargon. Structure your answers to match the user's intent.
If the query is unclear, ask the user to clarify it."""

prompt_template = PromptTemplate(
    input_variables=["query", "excerpts"],
    template=SYSTEM_PROMPT + "\n\nUser Query: {query}\n\nRelevant PDF Excerpts:\n{excerpts}\n\nAssistant:",
)
Create an Interactive Chat Loop
while True:
    query = input("Ask a question about the PDF (or type 'exit' to quit): ")
    if query.lower() in ["exit", "quit"]:
        print("Goodbye!")
        break

    # Retrieve the 3 most relevant chunks from Qdrant
    results = vector_store.similarity_search(query, k=3)
    context = "\n\n---\n\n".join(doc.page_content for doc in results)

    # Augment the prompt with the retrieved excerpts and let Gemini answer
    full_prompt = prompt_template.format(query=query, excerpts=context)
    response = llm.invoke(full_prompt)
    print("\nAssistant:", response.content, "\n")
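If answers look off, the retrieval step is usually the culprit rather than the model. You can inspect what Qdrant actually returns for a query with similarity_search_with_score (a debugging sketch, not part of the app):

for doc, score in vector_store.similarity_search_with_score("What is React?", k=3):
    print(f"score={score:.3f} | {doc.page_content[:100]}")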
Once everything is set up, run the following command:
python main.py
Once it's running, try asking questions related to your PDF.
🎉 Congrats! 🎉
You've just built a full RAG pipeline that brings your static PDFs to life using cutting-edge AI:
✅ LangChain to glue everything together
✅ Qdrant for fast semantic search
✅ Gemini for smart responses
✅ All running locally on your Windows machine
This setup works for cheat sheets, textbooks, resumes, reports, and more.
🤝 Let's Connect
Enjoyed this tutorial? Share it, fork it, and follow for more on LangChain, LLMs, and full-stack AI.
Happy Building! π οΈ
Here is the GitHub repository for the code: https://github.com/Vedu-07/Generative_AI
