Query Transformation - RAG series


Hey everyone, welcome to the Gen AI series.
I hope you enjoyed the previous blog, where we discussed step by step how to chat with our PDF using the RAG approach.
We will continue that code and modify it to get more accurate results.
Introduction to Query Transformation
The entire game is about getting a relevant answer for the user: not just what the user ‘asked‘, but what the user intends to ask.
Think of Google Search: do we get only what we searched for, or all the things related and relevant to that query?
A user’s query is often not precise enough to express what they actually need.
It has to be polished enough to pull that specific thing from the AI.
A user’s query can be more abstract, or less abstract, than what they really want.
An example:
Let’s say you own a hardware shop. Only you know which item is kept where, what each thing is called, and so on.
Now customers are plenty and the shop is big, so you need to hire a boy to help at the shop.
He knows a few things, but he has no experience of your shop.
So he won’t work correctly from day one; he will make mistakes, and when a customer asks for one thing, he’ll bring something else.
You have to explain to him in advance what a customer might say: the customer may ask for something else entirely, may not know the name of the thing, or may ask for the wrong item by mistake.
You have to train him on all of this, so that whenever you are not at the shop, he can handle it himself.
RAGs have the same problem: they are intelligent, but we still need to polish them and make them reiterate on what the user has asked and what to respond with.
Parallel Query - Fan-out Retrieval
Let’s explore how this approach works. In fan-out retrieval, the user’s query is rewritten into several variants, each variant is run against the vector store, and the merged results form the context.
So, here we will be using the langchain package.
First, we have to import MultiQueryRetriever:
from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_openai import ChatOpenAI
Then, we need to call this MultiQueryRetriever on top of the regular retriever we have.
Before that, we need multiple queries generated from the user’s query, so we use a ChatOpenAI model for that task:
import os

llm = ChatOpenAI(
    model_name="gpt-4o-mini",
    temperature=0,
    openai_api_key=os.environ.get("OPENAI_API_KEY"),
)
Next, we will use the multi-query retriever to run multiple similarity searches, collect the relevant chunks from each, and then derive the context for our AI model:
retriever = QdrantVectorStore.from_existing_collection(
    url="http://localhost:6333",
    collection_name="langchain_learning",
    embedding=embedder,
)
multi_query_retriever = MultiQueryRetriever.from_llm(
    retriever=retriever.as_retriever(),
    llm=llm,
)
The final step is to set the context and log our output. Internally, the multi-query retriever selects the unique chunks across the multiple result sets and returns them:
relevant_docs = multi_query_retriever.invoke(user_query)

SYSTEM_PROMPT = f"""
You are a helpful AI assistant who has access to a specific document of the user.
The user will ask questions from it; answer only those.
I have given you the context from which you have to answer.

context={relevant_docs}
"""
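One caveat: `relevant_docs` is a list of Document objects, so interpolating the list directly into the f-string dumps its full repr, metadata included. A small helper keeps the context readable (a sketch; the `Doc` dataclass here is just a stand-in for LangChain’s `Document`, which exposes `page_content`):

```python
from dataclasses import dataclass

@dataclass
class Doc:
    page_content: str  # stand-in for LangChain's Document

def format_context(docs):
    # Join only the text of each retrieved chunk, with a visible divider,
    # instead of dumping the whole list repr into the prompt.
    return "\n\n---\n\n".join(doc.page_content for doc in docs)

docs = [Doc("RAG combines retrieval with generation."),
        Doc("Fan-out retrieval rewrites the query into variants.")]
print(format_context(docs))
```

With the real retrieved documents you would interpolate `format_context(relevant_docs)` into the prompt instead of the raw list.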
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_query},
    ],
)
print(response.choices[0].message.content)
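Under the hood, the “unique chunks” step is essentially a union of the per-query result lists that keeps the first occurrence of each chunk. A minimal sketch, with plain strings standing in for Document objects and hypothetical chunk names:

```python
def unique_union(result_lists):
    # Merge the ranked results of each reworded query, keeping only
    # the first occurrence of every chunk (order-preserving dedupe).
    seen = set()
    merged = []
    for results in result_lists:
        for chunk in results:
            if chunk not in seen:
                seen.add(chunk)
                merged.append(chunk)
    return merged

# Hypothetical results from three reworded queries:
fan_out_results = [
    ["chunk_a", "chunk_b"],
    ["chunk_b", "chunk_c"],
    ["chunk_a", "chunk_d"],
]
print(unique_union(fan_out_results))  # ['chunk_a', 'chunk_b', 'chunk_c', 'chunk_d']
```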
The output I got for the question “Explain the major challenges faced during AI model deployment.”:
The major challenges faced during AI model deployment include:
Adjusting Workflows: Developers need to adapt their existing workflows, prompts, and data to work with the new models, which can have unique quirks, strengths, and weaknesses.
Versioning and Evaluation Infrastructure: Without proper infrastructure in place for versioning and monitoring the performance of the models, deployment can lead to numerous complications and operational headaches.
Regulatory Changes: Regulations surrounding AI technologies are constantly evolving. For example, AI resources can be heavily regulated as national security issues, and compliance with regulations such as the GDPR can be costly and complex.
Compute Resource Availability: Changes in laws can suddenly limit access to compute resources, such as being banned from purchasing GPUs from certain vendors, impacting the ability to deploy models effectively.
Intellectual Property Concerns: There are uncertainties regarding intellectual property when utilizing models trained on data that may not be owned by the developer. This can create hesitancies, especially for companies deeply invested in their IP.
These challenges highlight the importance of thorough planning and consideration of the evolving landscape of AI deployment.
Stay tuned, as we will discuss more Query Transformation approaches in our upcoming blogs:
Reciprocal Rank Fusion
Query Decomposition
Well, this wraps up one of the Advanced RAG concepts: Parallel Query Retrieval (fan-out retrieval).
See you in the next blog
Let’s connect here on Twitter
Peace out ✌️
Written by Mubashir Ahmed