# Building a RAG App with OpenAI, LangChain & Qdrant: PDF Q&A Powered by GPT-4


Recently, I built a RAG (Retrieval Augmented Generation) application that allows users to query a PDF document (in my case, a Node.js PDF) using natural language, and receive page-specific, context-aware answers powered by OpenAI's GPT-4. The tech stack includes LangChain, Qdrant, and Python, and the project is entirely local and open-source friendly.
## How It Works
### 1. PDF Loading & Chunking (`indexing.py`)
We use `PyPDFLoader` from `langchain_community` to load the PDF, then chunk it into manageable sections using `RecursiveCharacterTextSplitter`.
```python
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = PyPDFLoader(file_path=pdf_path)
docs = loader.load()

# Overlapping chunks preserve context across chunk boundaries
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
split_docs = text_splitter.split_documents(docs)
```
### 2. Embedding with OpenAI + Storing in Qdrant
We use `OpenAIEmbeddings` (model: `text-embedding-3-large`) and store the resulting vectors in Qdrant, a high-performance vector database running locally on port 6333.
```python
from langchain_openai import OpenAIEmbeddings
from langchain_qdrant import QdrantVectorStore

embedding_model = OpenAIEmbeddings(model="text-embedding-3-large")

vector_store = QdrantVectorStore.from_documents(
    documents=split_docs,
    url="http://localhost:6333",
    collection_name="rag-app-vectors",
    embedding=embedding_model,
)
```
Once done, all PDF content is chunked, embedded, and indexed!
### Query Time (`chat.py`)
Users can now input natural language queries, and we do a vector similarity search using LangChain's integration with Qdrant.
```python
# vector_db reconnects to the existing Qdrant collection
# (e.g. via QdrantVectorStore.from_existing_collection)
query = input("> ")
search_result = vector_db.similarity_search(query=query)
```
The retrieved context is injected into the system prompt, which instructs GPT-4 to answer the query based only on the retrieved chunks and to point the user to the relevant PDF page.
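Building the `context` string from the search results can be done in a few lines. A minimal sketch (the helper name `build_context` is mine; the field names follow LangChain's `Document`, which stores the chunk text in `page_content` and, for PDFs loaded with `PyPDFLoader`, the page number in `metadata["page"]`):

```python
def build_context(docs):
    # Prefix each retrieved chunk with its page number so the model
    # can cite the exact page in its answer.
    return "\n\n".join(
        f"Page {doc.metadata.get('page', '?')}: {doc.page_content}"
        for doc in docs
    )
```

Then `context = build_context(search_result)` before formatting the system prompt.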
```python
SYSTEM_PROMPT = f"""
You are a helpful AI assistant...
Context:
{context}
"""
```
Then we run the final prompt with the OpenAI client's `chat.completions.create`:
```python
chat_completion = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": query},
    ],
)
print(chat_completion.choices[0].message.content)
```
## Tech Stack
- Python
- LangChain
- OpenAI API (GPT-4.1 + `text-embedding-3-large`)
- Qdrant vector DB
- `PyPDFLoader`
- `.env` for API key management
## Bonus: Secure with .env
I kept all credentials and keys inside a `.env` file and used `load_dotenv()` to manage them securely.
## Testing It Out
Sample query:

```
> What is event-driven programming in Node.js?
```
GPT-4 responds with a summarized answer from the relevant chunk and mentions the exact page number to look at in the PDF. Super helpful!
## Why I Built This
I wanted a way to interact with technical PDFs like cheat sheets or documentation in a natural, conversational way. Instead of manually reading a 100-page PDF, now I can just ask questions and get the answer with source references.
## Next Steps
- Add a Streamlit or Next.js UI
- Add file upload support
- Add source highlighting
- Deploy Qdrant in Docker (a `docker-compose.yml` is already included)
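For reference, a minimal `docker-compose.yml` for a local Qdrant instance looks roughly like this (the image name and port are Qdrant's documented defaults; the exact file in the repo may differ):

```yaml
services:
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"  # REST API used by QdrantVectorStore
    volumes:
      - ./qdrant_storage:/qdrant/storage  # persist vectors across restarts
```

With this in place, `docker compose up -d` brings up the database the indexing script expects at `http://localhost:6333`.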
## Want to Try This?
Let me know in the comments if you'd like the GitHub repo. Happy to share and collaborate!
#RAG #LangChain #Qdrant #GPT4 #OpenAI #VectorSearch #Python #AIApps #LLM #FullstackAI
Written by Robin Roy