# Building a RAG App with OpenAI, LangChain & Qdrant: PDF Q&A Powered by GPT-4


Recently, I built a RAG (Retrieval Augmented Generation) application that allows users to query a PDF document (in my case, a Node.js PDF) using natural language, and receive page-specific, context-aware answers powered by OpenAI's GPT-4. The tech stack includes LangChain, Qdrant, and Python, and the project is entirely local and open-source friendly.
## How It Works
### 1. PDF Loading & Chunking (`indexing.py`)
We use `PyPDFLoader` from `langchain_community` to load the PDF, then chunk it into manageable sections using `RecursiveCharacterTextSplitter`.
```python
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = PyPDFLoader(file_path=pdf_path)
docs = loader.load()

# Overlapping chunks preserve context across chunk boundaries
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
split_docs = text_splitter.split_documents(docs)
```
### 2. Embedding with OpenAI + Storing in Qdrant
We use `OpenAIEmbeddings` (model: `text-embedding-3-large`) and store the resulting vectors in Qdrant, a high-performance vector database running locally on port 6333.
```python
from langchain_openai import OpenAIEmbeddings
from langchain_qdrant import QdrantVectorStore

embedding_model = OpenAIEmbeddings(model="text-embedding-3-large")

vector_store = QdrantVectorStore.from_documents(
    documents=split_docs,
    url="http://localhost:6333",
    collection_name="rag-app-vectors",
    embedding=embedding_model,
)
```
Once done, all PDF content is chunked, embedded, and indexed!
### Query Time (`chat.py`)
Users can now input natural language queries, and we do a vector similarity search using LangChain's integration with Qdrant.
```python
# vector_db reconnects to the existing Qdrant collection
# (e.g. via QdrantVectorStore.from_existing_collection)
query = input("> ")
search_result = vector_db.similarity_search(query=query)
```
The retrieved context is injected into the system prompt, which instructs GPT-4 to answer the query based only on the retrieved chunks and to point the user to the relevant PDF page.
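Building the `context` string from the search results can be done in a few lines. A minimal sketch (the helper name `build_context` is mine; the field names follow LangChain's `Document`, which stores the chunk text in `page_content` and, for PDFs loaded with `PyPDFLoader`, the page number in `metadata["page"]`):

```python
def build_context(docs):
    # Prefix each retrieved chunk with its page number so the model
    # can cite the exact page in its answer.
    return "\n\n".join(
        f"Page {doc.metadata.get('page', '?')}: {doc.page_content}"
        for doc in docs
    )
```

Then `context = build_context(search_result)` before formatting the system prompt.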
```python
SYSTEM_PROMPT = f"""
You are a helpful AI assistant...
Context:
{context}
"""
```
Then we run the final prompt with the OpenAI client's `chat.completions.create`:
```python
chat_completion = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": query},
    ],
)
print(chat_completion.choices[0].message.content)
```
## Tech Stack
- Python
- LangChain
- OpenAI API (GPT-4.1 + `text-embedding-3-large`)
- Qdrant vector DB
- `PyPDFLoader`
- `.env` for API key management
## Bonus: Secure with .env
I kept all credentials and keys inside a `.env` file and used `load_dotenv()` to manage them securely.
## Testing It Out
Sample query:

```
> What is event-driven programming in Node.js?
```
GPT-4 responds with a summarized answer from the relevant chunk and mentions the exact page number to look at in the PDF. Super helpful!
## Why I Built This
I wanted a way to interact with technical PDFs like cheat sheets or documentation in a natural, conversational way. Instead of manually reading a 100-page PDF, now I can just ask questions and get the answer with source references.
## Next Steps
- Add a Streamlit or Next.js UI
- Add file upload support
- Add source highlighting
- Deploy Qdrant in Docker (a `docker-compose.yml` is already included)
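For reference, a minimal `docker-compose.yml` for a local Qdrant instance looks roughly like this (the image name and port are Qdrant's documented defaults; the exact file in the repo may differ):

```yaml
services:
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"  # REST API used by QdrantVectorStore
    volumes:
      - ./qdrant_storage:/qdrant/storage  # persist vectors across restarts
```

With this in place, `docker compose up -d` brings up the database the indexing script expects at `http://localhost:6333`.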
## Want to Try This?
Let me know in the comments if you'd like the GitHub repo. Happy to share and collaborate!
#RAG #LangChain #Qdrant #GPT4 #OpenAI #VectorSearch #Python #AIApps #LLM #FullstackAI
Written by Robin Roy