Basic RAG for PDF chat - Short & Crisp


Overview
So, RAG stands for Retrieval‑Augmented Generation. Basically, it’s the process of injecting external data into a prompt so the LLM can give the desired, grounded response. That external data could come from a database, the web, API calls, local files, etc.
What to do
Injection Phase
Accept the user's PDF
Create vector embeddings
Store them in a vector DB
Retrieval Phase
Accept the user's query
Create vector embeddings for the query
Search the vector DB and retrieve the relevant chunks
Now, based on these relevant chunks plus the user query, we ask the LLM and return its response to the user. That’s a basic, typical RAG pipeline (sketched right below). You can choose any embedding model, LLM, and vector store you prefer.
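In code terms, the two phases boil down to something like this rough outline. Every helper name here (load_pdf, split, embed, vector_db, build_prompt) is just a placeholder, not a real API; the concrete LangChain version follows in the "How to do" section.
# Rough outline only; every helper name is a placeholder
def ingest(pdf_path):
    docs = load_pdf(pdf_path)             # Loader
    chunks = split(docs)                  # Chunking
    vector_db.add(embed(chunks))          # Embeddings -> vector DB

def answer(user_query):
    chunks = vector_db.search(embed(user_query))    # Retrieve relevant chunks
    return llm(build_prompt(chunks, user_query))    # chunks + query -> LLM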
How to do
Loader
from langchain_community.document_loaders import PyPDFLoader

file_path = "./public/file_name.pdf"  # path to the uploaded PDF
loader = PyPDFLoader(file_path)
docs = loader.load()  # one Document per page
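A quick, optional sanity check: PyPDFLoader returns one Document per page, so you can confirm the load worked before chunking.
# Optional: verify the PDF loaded correctly
print(len(docs), "pages loaded")
print(docs[0].page_content[:200])  # preview of the first page
print(docs[0].metadata)            # e.g. source path and page number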
Chunking
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
split_docs = text_splitter.split_documents(docs)  # reuse the docs loaded above
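Also optional, but worth a look: each chunk stays around chunk_size characters, and the 100‑character overlap keeps context from being cut off at chunk boundaries.
# Optional: inspect the split
print(len(split_docs), "chunks created")
print(max(len(d.page_content) for d in split_docs))  # usually <= 1000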
Make vector embeddings & store them in a vector DB
import os

from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_qdrant import QdrantVectorStore

embeddings = GoogleGenerativeAIEmbeddings(model="models/text-embedding-004")

# Create the collection on the first run (no documents yet)
qdrant = QdrantVectorStore.from_documents(
    documents=[],  # empty for the first run; the embeddings are added below
    embedding=embeddings,
    collection_name="learning-genai",
    url=os.getenv("QDRANT_URL"),
    api_key=os.getenv("QDRANT_API_KEY"),
)
# Embed the chunks and store them in Qdrant
qdrant.add_documents(documents=split_docs)

# Reconnect to the same collection for retrieval
retriever = QdrantVectorStore.from_existing_collection(
    embedding=embeddings,
    collection_name="learning-genai",
    url=os.getenv("QDRANT_URL"),
    api_key=os.getenv("QDRANT_API_KEY"),
)
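One thing to note: the snippet above assumes QDRANT_URL and QDRANT_API_KEY are set in the environment, and GoogleGenerativeAIEmbeddings also expects GOOGLE_API_KEY. One common way to handle this (assuming you keep the keys in a local .env file and have python-dotenv installed) is:
# .env (keep it out of version control)
# GOOGLE_API_KEY=...
# QDRANT_URL=...
# QDRANT_API_KEY=...

from dotenv import load_dotenv
load_dotenv()  # loads the variables above into the process environment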
Search in vector DB and Retrieve Relevant Chunks
from langchain_groq import ChatGroq
from langchain_core.prompts import ChatPromptTemplate

user_query = input("Ask something about the PDF: ")  # take the user query as input

relevant_chunks = retriever.similarity_search(query=user_query)
# for res in relevant_chunks:
#     print(f"{res.page_content} [{res.metadata}]")

llm = ChatGroq(
    model="llama-3.1-8b-instant",
    temperature=0,
)

prompt_template = ChatPromptTemplate([
    ("system", "You are a helpful AI assistant who responds based on the available context.\n\n{context}"),
    ("user", "Give a short and crisp answer to this query: {query}"),
])

# Pass only the chunk text as context, plus the user's query
context = "\n\n".join(chunk.page_content for chunk in relevant_chunks)
prompt = prompt_template.invoke({"context": context, "query": user_query})
response = llm.invoke(prompt)
print(response.content)
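Putting it all together, here is a minimal sketch of how the retrieval step and the LLM call can be tied into a simple chat loop; ask_pdf is a hypothetical helper name of mine, reusing the retriever, prompt_template, and llm defined above.
# Minimal sketch of a chat loop (ask_pdf is a made-up helper, not a library function)
def ask_pdf(user_query: str) -> str:
    relevant_chunks = retriever.similarity_search(query=user_query)
    context = "\n\n".join(chunk.page_content for chunk in relevant_chunks)
    prompt = prompt_template.invoke({"context": context, "query": user_query})
    return llm.invoke(prompt).content

while True:
    query = input("Ask about the PDF (or 'exit'): ")
    if query.strip().lower() == "exit":
        break
    print(ask_pdf(query))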
Written by Satyajit Patel