Query Transformation Patterns


Before diving into the world of query transformation patterns, let's first understand what query transformation is.
Query Transformation:
Query transformation is a way to enhance the user's query or question so that we can improve the accuracy of a RAG (Retrieval-Augmented Generation) system.
The purpose of query transformation is to capture more of the query's semantic meaning in the embedding (vector) space, which improves retrieval accuracy.
Here is the difference in retrieved context before and after query transformation.
Now that you know what query transformation is and why it is important, let's dive deep into the patterns of query transformation.
Parallel Query Retrieval (Fan-Out)
Parallel query retrieval is a technique that breaks the user's query into several sub-queries so that we can retrieve more relevant context and improve the accuracy of a RAG (Retrieval-Augmented Generation) system.
Suppose our data source is a PDF about investing, and after uploading the PDF we ask: "How to start investing in stocks?". Before parallel query retrieval, the RAG pipeline resolves this query with a single retrieval pass.
With parallel query retrieval, only the retrieval step changes.
After rewriting the user query we get three different sub-queries; if we perform vector embedding and similarity search on every sub-query, we gather more context than the original query alone would return.
Prompt for Rewriting the Query
You are a helpful AI Assistant.
Your task is to take the user query and
break down the user query into different sub-queries.
Rule:
Minimum Sub Query Length :- 3
Maximum Sub Query Length :- 5
Example:
Query: How to become GenAI Developer?
Output: [
'How to become GenAI Developer?',
'What is GenAI?',
'What is Developer?',
'What is GenAI Developer?',
'Steps to become GenAI Developer.'
]
Full Code - Gemini AI (INDEXING, RETRIEVAL and GENERATION)
from pathlib import Path # File Path
from langchain_community.document_loaders import PyPDFLoader # Loader
from langchain_text_splitters import RecursiveCharacterTextSplitter # Text Splitter
from langchain_google_genai import GoogleGenerativeAIEmbeddings # Google Embedding
from langchain_qdrant import QdrantVectorStore # Vector Store
# GOOGLE GENERATIVE AI
from google import genai
from google.genai import types
from concurrent.futures import ThreadPoolExecutor #Multithreading
from itertools import chain #Flatten
import ast #Parsing
# === CONFIGURATION ===
# Initialize the Gemini client with your API key
genai_client = genai.Client(api_key='GOOGLE_GEMINI_API')
# Google Generative AI Embeddings
embeddings = GoogleGenerativeAIEmbeddings(
model="models/text-embedding-004",
google_api_key="GOOGLE_GEMINI_API"
)
# === INDEXING PART ===
# Data Source - PDF
pdf_path = Path(__file__).parent / "file_path.pdf"
# Load the document from the PDF file
loader = PyPDFLoader(file_path=pdf_path)
docs = loader.load()
# Split the document into smaller chunks, Adjust chunk_size and chunk_overlap
# according to your need
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
split_docs = text_splitter.split_documents(documents=docs)
# Create a vector store - if collection exists
# vector_store = QdrantVectorStore.from_existing_collection(
# url="http://localhost:6333",
# collection_name="collection_name",
# embedding=embeddings
# )
# Create a new vector store - if collection doesn't already exist
vector_store = QdrantVectorStore.from_documents(
documents=[],
url="http://localhost:6333",
collection_name="collection_name", # Name of your collection in Qdrant
embedding=embeddings
)
# Add the documents to the vector store
vector_store.add_documents(split_docs)
# === RETRIEVAL PART ===
retriever = QdrantVectorStore.from_existing_collection(
url="http://localhost:6333",
collection_name="collection_name", # Name of your collection in Qdrant
embedding=embeddings
)
user_query = "What is FS Module?" # User Query
# === SUB-QUERY EXTRACTION USING GEMINI ===
# System prompt for breaking down the user's query into sub-queries
system_prompt_for_subqueries = """
You are a helpful AI Assistant.
Your task is to take the user query and break it down into different sub-queries.
Rule:
Minimum Sub Query Length :- 3
Maximum Sub Query Length :- 5
Example:
Query: How to become GenAI Developer?
Output: [
"How to become GenAI Developer?",
"What is GenAI?",
"What is Developer?",
"What is GenAI Developer?",
"Steps to become GenAI Developer."
]
"""
# Call Gemini API to break down the user's query into sub-queries
breakdown_response = genai_client.models.generate_content(
model='gemini-2.0-flash-001',
contents=f"Query: {user_query}",
config=types.GenerateContentConfig(system_instruction=system_prompt_for_subqueries)
)
# Convert the Gemini response to a Python list (parse the output safely)
sub_queries = ast.literal_eval(breakdown_response.text.strip())
print("Sub Queries:", sub_queries)
# === PARALLEL VECTOR RETRIEVAL ===
# Function to retrieve relevant document chunks for each sub-query
def retrieve_chunks(query):
    return retriever.similarity_search(query=query)
# Use ThreadPoolExecutor to perform parallel retrieval of chunks for each sub-query
with ThreadPoolExecutor() as executor:
    all_chunks = list(executor.map(retrieve_chunks, sub_queries))
# Flatten the list of results (if there are multiple chunks per sub-query)
flattened_chunks = list(chain.from_iterable(all_chunks))
# Optionally remove duplicate chunks (based on content)
unique_chunks = list({doc.page_content: doc for doc in flattened_chunks}.values())
# === Generation Part ===
# Prepare the final system prompt with the unique relevant document chunks
final_system_prompt = f"""
You are a helpful assistant who answers the user's query using the following pieces of context.
If you don't know the answer, just say you don't know — don't make up an answer.
Context:
{[doc.page_content for doc in unique_chunks]}
"""
# Send the final request to Gemini for generating the response using the relevant context
final_response = genai_client.models.generate_content(
model='gemini-2.0-flash-001',
contents=user_query, # The original user query
config=types.GenerateContentConfig(system_instruction=final_system_prompt)
)
# Output the final response
print("\nFinal Answer:\n")
print(final_response.text)
# Packages for GeminiAI
pip install langchain-community
pip install pypdf
pip install langchain-google-genai
pip install qdrant-client
pip install langchain-qdrant
pip install google-genai
# Replace this in the code
GOOGLE_GEMINI_API = your api key
collection_name = Qdrant collection name
file_path = your pdf file path
user_query = your query
NOTE: Make sure Qdrant is running at localhost:6333 (via Docker or manually) before running this code.
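If you don't have a Qdrant instance yet, a local one can be started with Docker, for example:
docker run -p 6333:6333 qdrant/qdrant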
One problem with this technique: after filtering out duplicate chunks, we send them to the LLM in arbitrary order. But what happens if one chunk is more important because it comes up for every sub-query, and we happen to place it last?
So we have to prioritize the chunks. The solution to this problem is the next technique.
Reciprocal Rank Fusion
Reciprocal Rank Fusion is a technique where we rank the chunks or documents so that those appearing frequently and high up across the result lists end up at the top.
Rank Fusion Function (Ranking Chunks):
# Rank the Documents (Chunks)
def reciprocal_rank_fusion(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0) + 1 / (k + rank + 1)
    sorted_docs = sorted(scores.items(), key=lambda x: x[1], reverse=True)
    return [doc_id for doc_id, score in sorted_docs]
# Testing Only
rankings = [['chunk 1', 'chunk 2'], ['chunk 2', 'chunk 3'], ['chunk 1', 'chunk 3']]
results = reciprocal_rank_fusion(rankings)
print(results) # Output: ['chunk 1', 'chunk 2', 'chunk 3']
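To see where this ordering comes from, with k = 60: chunk 1 appears at rank 0 in two lists, so its score is 1/61 + 1/61 ≈ 0.0328; chunk 2 appears at rank 1 and rank 0, scoring 1/62 + 1/61 ≈ 0.0325; chunk 3 appears at rank 1 twice, scoring 1/62 + 1/62 ≈ 0.0323. Chunks that show up often and near the top of many lists accumulate the highest scores, so they are sent to the model first.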
Full Code - Gemini AI (INDEXING, RETRIEVAL and GENERATION)
from pathlib import Path # File Path
from langchain_community.document_loaders import PyPDFLoader # Loader
from langchain_text_splitters import RecursiveCharacterTextSplitter # Text Splitter
from langchain_google_genai import GoogleGenerativeAIEmbeddings # Google Embedding
from langchain_qdrant import QdrantVectorStore # Vector Store
# GOOGLE GENERATIVE AI
from google import genai
from google.genai import types
from concurrent.futures import ThreadPoolExecutor #Multithreading
from itertools import chain #Flatten
import ast #Parsing
# === CONFIGURATION ===
# Initialize the Gemini client with your API key
genai_client = genai.Client(api_key='GOOGLE_GEMINI_API')
# Google Generative AI Embeddings
embeddings = GoogleGenerativeAIEmbeddings(
model="models/text-embedding-004",
google_api_key="GOOGLE_GEMINI_API"
)
# === INDEXING PART ===
# Data Source - PDF
pdf_path = Path(__file__).parent / "file_path.pdf"
# Load the document from the PDF file
loader = PyPDFLoader(file_path=pdf_path)
docs = loader.load()
# Split the document into smaller chunks, Adjust chunk_size and chunk_overlap
# according to your need
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
split_docs = text_splitter.split_documents(documents=docs)
# Create a vector store - if collection exists
# vector_store = QdrantVectorStore.from_existing_collection(
# url="http://localhost:6333",
# collection_name="collection_name",
# embedding=embeddings
# )
# Create a new vector store - if collection doesn't already exist
vector_store = QdrantVectorStore.from_documents(
documents=[],
url="http://localhost:6333",
collection_name="collection_name", # Name of your collection in Qdrant
embedding=embeddings
)
# Add the documents to the vector store
vector_store.add_documents(split_docs)
# === RETRIEVAL PART ===
retriever = QdrantVectorStore.from_existing_collection(
url="http://localhost:6333",
collection_name="collection_name", # Name of your collection in Qdrant
embedding=embeddings
)
user_query = "What is FS Module?" # User Query
# === SUB-QUERY EXTRACTION USING GEMINI ===
# System prompt for breaking down the user's query into sub-queries
system_prompt_for_subqueries = """
You are a helpful AI Assistant.
Your task is to take the user query and break it down into different sub-queries.
Rule:
Minimum Sub Query Length :- 3
Maximum Sub Query Length :- 5
Example:
Query: How to become GenAI Developer?
Output: [
"How to become GenAI Developer?",
"What is GenAI?",
"What is Developer?",
"What is GenAI Developer?",
"Steps to become GenAI Developer."
]
"""
# Call Gemini API to break down the user's query into sub-queries
breakdown_response = genai_client.models.generate_content(
model='gemini-2.0-flash-001',
contents=f"Query: {user_query}",
config=types.GenerateContentConfig(system_instruction=system_prompt_for_subqueries)
)
# Convert the Gemini response to a Python list (parse the output safely)
sub_queries = ast.literal_eval(breakdown_response.text.strip())
print("Sub Queries:", sub_queries)
# === Reciprocal Rank Fusion ===
# Function to retrieve relevant document chunks for each sub-query
def retrieve_chunks(query):
    return retriever.similarity_search(query=query)
# Use ThreadPoolExecutor to perform parallel retrieval of chunks for each sub-query
with ThreadPoolExecutor() as executor:
    all_chunks = list(executor.map(retrieve_chunks, sub_queries))
# Helper to generate a unique ID for each chunk (or you can use doc.metadata['id'] if available)
def get_doc_id(doc):
    return doc.page_content.strip()[:50]  # Use first 50 characters as an ID
# Create rankings (lists of doc_ids per sub-query result)
rankings = []
for result in all_chunks:
    rankings.append([get_doc_id(doc) for doc in result])
# Reciprocal Rank Fusion
def reciprocal_rank_fusion(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0) + 1 / (k + rank + 1)
    sorted_docs = sorted(scores.items(), key=lambda x: x[1], reverse=True)
    return [doc_id for doc_id, _ in sorted_docs]
# Get final ranked doc IDs
final_doc_ids = reciprocal_rank_fusion(rankings)
# Map doc IDs to actual chunks
doc_map = {get_doc_id(doc): doc for doc in chain.from_iterable(all_chunks)}
ranked_chunks = [doc_map[doc_id] for doc_id in final_doc_ids if doc_id in doc_map]
# === GENERATION PART ===
# Prepare the final system prompt with the top-ranked chunks
final_system_prompt = f"""
You are a helpful assistant who answers the user's query using the following pieces of context.
If you don't know the answer, just say you don't know — don't make up an answer.
Context:
{[doc.page_content for doc in ranked_chunks]}
"""
# Final call to Gemini using top-ranked documents
final_response = genai_client.models.generate_content(
model='gemini-2.0-flash-001',
contents=user_query,
config=types.GenerateContentConfig(system_instruction=final_system_prompt)
)
# Output the final answer
print("\nFinal Answer:\n")
print(final_response.text)
Use this rank fusion technique to further increase the accuracy of your RAG system and get more reliable results.
Step Back Prompting
Step-back prompting is a technique to improve a RAG system's reasoning by generalizing a complex question.
We follow two steps:
Step Back - First, ask the model to convert the user's original question into a broader, conceptual question so that we can retrieve more context.
Answer - Use this step-back question to guide a more accurate and generalizable response.
Example:
Original Question: Jan Sindel was born in what country?
Step-Back Question: What is Jan Sindel's personal history?
Paper: Take a Step Back (Google DeepMind)
Prompt for Rewriting the Query
You are a helpful AI Assistant.
Your task is to take the user's original query and convert it into a
conceptual question.
Example:
Query: In which year was Mahatma Gandhi born?
Step-Back: What is Mahatma Gandhi's personal history?
Query: Which skills are required to become a Software Developer?
Step-Back: How to become a Software Developer?
In parallel query retrieval and reciprocal rank fusion we work with multiple queries, but here we use a single, broader query to gather more context in the retrieval step.
Full Code - Gemini AI (INDEXING, RETRIEVAL and GENERATION)
from pathlib import Path
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
# from langchain_openai import OpenAIEmbeddings
from langchain_google_genai import GoogleGenerativeAIEmbeddings
# Vector Store
from langchain_qdrant import QdrantVectorStore
# GOOGLE GENERATIVE AI
from google import genai
from google.genai import types
# === CONFIGURATION ===
# Initialize the Gemini client with your API key
genai_client = genai.Client(api_key='GOOGLE_GEMINI_API')
# Google Generative AI Embeddings
embeddings = GoogleGenerativeAIEmbeddings(
model="models/text-embedding-004",
google_api_key="GOOGLE_GEMINI_API"
)
# === INDEXING PART ===
# Data Source - PDF
pdf_path = Path(__file__).parent / "file_path.pdf"
# Load the document from the PDF file
loader = PyPDFLoader(file_path=pdf_path)
docs = loader.load()
# Split the document into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
split_docs = text_splitter.split_documents(documents=docs)
# create a vector store - if collection does not exist
vector_store = QdrantVectorStore.from_documents(
documents=[],
url="http://localhost:6333",
collection_name="collection_name", # Name of your collection in Qdrant
embedding=embeddings
)
# Add the documents to the vector store
vector_store.add_documents(split_docs)
# Retrieval part
retriever = QdrantVectorStore.from_existing_collection(
url="http://localhost:6333",
collection_name="collection_name", # Name of your collection in Qdrant
embedding=embeddings
)
user_query = "What is Node.js and how does it work?"
# System prompt that converts the user's query into a broader, conceptual (step-back) question
step_back_prompt = """
You are a helpful AI Assistant.
Your task is to take the user's original query and convert it into a conceptual question.
Example:
Query: In which year was Mahatma Gandhi born?
Step-Back: What is Mahatma Gandhi's personal history?
Query: Which skills are required to become a Software Developer?
Step-Back: How to become a Software Developer?
"""
# Ask Gemini for the step-back question
response = genai_client.models.generate_content(
    model="gemini-2.0-flash-001",
    contents=f"Query: {user_query}",
    config=types.GenerateContentConfig(system_instruction=step_back_prompt)
)
step_back_question = response.text
# Retrieve chunks using the broader step-back question
relevant_chunks = retriever.similarity_search(query=step_back_question)
# Generation part
final_system_prompt = f"""
You are a helpful assistant who answers the user's query using the following pieces of context.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Context:
{[doc.page_content for doc in relevant_chunks]}
"""
final_response = genai_client.models.generate_content(
model="gemini-2.0-flash-001",
contents=user_query,
config=types.GenerateContentConfig(system_instruction=final_system_prompt)
)
print("\nFinal Answer:\n")
print(final_response.text)
CoT - Chain of Thought Prompting
Chain-of-thought prompting is a technique where the model plans step by step before answering the user's query. Here we process one sub-query (one step) at a time.
Prompt for Rewriting the Query
You are a helpful AI Assistant.
Your task is to create a step by step plan and think how to answer the user's Query
and provide the output steps in JSON format. Last step should be the user query.
Rule:
1. Follow the strict JSON output as per Output schema.
2. Always perform one step at a time and wait for next input
3. Carefully analyse the user query
4. Do not repeat the same step
5. Perform Maximum 4 steps
Example:
Query: What is FS Module in NodeJS?
Output: { step: "thinking", content: "What is FS?" }
Output: { step: "thinking", content: "What is Module?" }
Output: { step: "thinking", content: "What is NodeJS?" }
Output: { step: "thinking", content: "What is FS Module in Nodejs?" }
In CoT prompting, the number of steps can increase or decrease based on the user's query.
Full Code - Gemini AI (INDEXING, RETRIEVAL and GENERATION)
from pathlib import Path
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
# from langchain_openai import OpenAIEmbeddings
from langchain_google_genai import GoogleGenerativeAIEmbeddings
# Vector Store
from langchain_qdrant import QdrantVectorStore
# GOOGLE GENERATIVE AI
from google import genai
from google.genai import types
# Progress bar, timing and parsing utilities
from tqdm import tqdm
import time
import re
import json
# === CONFIGURATION ===
# Initialize the Gemini client with your API key
genai_client = genai.Client(api_key='GOOGLE_GEMINI_API')
# Google Generative AI Embeddings
embeddings = GoogleGenerativeAIEmbeddings(
model="models/text-embedding-004",
google_api_key="GOOGLE_GEMINI_API"
)
# === INDEXING PART ===
# Data Source - PDF
pdf_path = Path(__file__).parent / "file_path.pdf"
# Load the document from the PDF file
loader = PyPDFLoader(file_path=pdf_path)
docs = loader.load()
# Split the document into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
split_docs = text_splitter.split_documents(documents=docs)
# create a vector store - if collection does not exist
vector_store = QdrantVectorStore.from_documents(
documents=[],
url="http://localhost:6333",
collection_name="collection_name", # Name of collection in Qdrant Vector Database
embedding=embeddings
)
# Add the documents to the vector store
vector_store.add_documents(split_docs)
# Retrieval part
retriever = QdrantVectorStore.from_existing_collection(
url="http://localhost:6333",
collection_name="collection_name", # Name of collection in Qdrant Vector Database
embedding=embeddings
)
user_query = "What is Node.js and how does it work?" # User Query
CoT_system_prompt = f"""
You are a helpful AI Assistant.
Your task is to create a step by step plan and think how to answer the user's Query
and provide the output steps in JSON format. Last step should be the user query.
Rule:
1. Follow the strict JSON output as per Output schema.
2. Always perform one step at a time and wait for next input
3. Carefully analyse the user query
4. Do not repeat the same step
5. Perform Maximum 4 steps
Example:
Query: What is FS Module in NodeJS?
Output: {{ step: "thinking", content: "What is FS?" }}
Output: {{ step: "thinking", content: "What is Module?" }}
Output: {{ step: "thinking", content: "What is NodeJS?" }}
Output: {{ step: "thinking", content: "What is FS Module in Nodejs?" }}
"""
response = genai_client.models.generate_content(
model="gemini-1.5-pro-latest",
contents=user_query,
config=types.GenerateContentConfig(system_instruction=CoT_system_prompt)
)
# Extract all full JSON blocks
json_blocks = re.findall(r'```json\s*(\{.*?\})\s*```', response.text, re.DOTALL)
# Parse each block into dictionary
step_thoughts = [json.loads(block) for block in json_blocks]
print(step_thoughts)
# === RETRIEVAL PART ===
# Sleep with visible progress bar
def wait_with_progress(seconds):
    print(f"\n⏳ Waiting {seconds} seconds to avoid quota limits...\n")
    for _ in tqdm(range(seconds), desc="Sleeping", ncols=100):
        time.sleep(1)
final_answers = []
for step in step_thoughts:
    query = step["content"]
    if final_answers:
        previous_knowledge = "I know: " + " | ".join([ans["answer"] for ans in final_answers])
        query = f"{previous_knowledge}. Question: {step['content']}"
    print(f"\n🍳: {query}\n")
    print(f"🧠: {step}")
    # Retrieve relevant documents for this step
    docs = retriever.similarity_search(query)
    # Combine all retrieved chunks
    context = "\n".join([doc.page_content for doc in docs])
    # Feed to Gemini for answering this step
    prompt = f"""
    Based on the following context, answer this question:
    Context:
    {context}
    Question:
    {query}
    """
    # 🕒 Wait 50 seconds before each request to avoid quota errors in free api
    wait_with_progress(50)
    response = genai_client.models.generate_content(
        model="gemini-1.5-pro-latest",
        contents=prompt
    )
    step_answer = response.text.strip()
    final_answers.append({"question": query, "answer": step_answer})
# === GENERATION PART ===
# Prepare final answer from combined steps
combined_context = "\n".join([item["answer"] for item in final_answers])
final_user_query = step_thoughts[-1]["content"] # last step is original query
final_prompt = f"""
Using the following reasoning and context,
answer the final user query in a detailed yet simple way:
Context:
{combined_context}
Final Question:
{final_user_query}
"""
final_response = genai_client.models.generate_content(
model="gemini-1.5-pro-latest",
contents=final_prompt
)
# Display Final Answer
print("\n🎯 Final Answer to User Query:")
print(final_response.text.strip())
HyDE - Hypothetical Document Embedding
Hypothetical Document Embedding is a technique where we first have the LLM write a document (an answer) based on the user's query, and then perform similarity search with this generated document to find the relevant chunks.
This generated document or answer is known as a “Hypothetical Document”.
But the LLM needs to already know roughly what the answer looks like, so it works well only with large, knowledgeable models (e.g. GPT-4.1).
This technique is not well suited to legal documents, so be careful when using it on legal docs.
Full Code - Gemini AI (INDEXING, RETRIEVAL and GENERATION)
from pathlib import Path
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
# from langchain_openai import OpenAIEmbeddings
from langchain_google_genai import GoogleGenerativeAIEmbeddings
# Vector Store
from langchain_qdrant import QdrantVectorStore
# GOOGLE GENERATIVE AI
from google import genai
from google.genai import types
pdf_path = Path(__file__).parent / "file_path.pdf"
# === CONFIGURATION ===
# Initialize the Gemini client with your API key
genai_client = genai.Client(api_key='GOOGLE_GEMINI_API')
# Google Generative AI Embeddings
embeddings = GoogleGenerativeAIEmbeddings(
model="models/text-embedding-004",
google_api_key="GOOGLE_GEMINI_API"
)
# Load the document
loader = PyPDFLoader(file_path=pdf_path)
# Create a list of documents
docs = loader.load()
# Split the document into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
split_docs = text_splitter.split_documents(documents=docs)
# Create a vector store - if collection exists
# vector_store = QdrantVectorStore.from_existing_collection(
# url="http://localhost:6333",
# collection_name="GenAI",
# embedding=embeddings
# )
# create a vector store - if collection does not exist
vector_store = QdrantVectorStore.from_documents(
documents=[],
url="http://localhost:6333",
collection_name="collection_name", # Name of collection in Qdrant Vector Database
embedding=embeddings
)
# Add the documents to the vector store
vector_store.add_documents(split_docs)
# Retrieval part
retriever = QdrantVectorStore.from_existing_collection(
url="http://localhost:6333",
collection_name="collection_name", # Name of collection in Qdrant Vector Database
embedding=embeddings
)
user_query = "What is Node.js and how does it work?"
system_prompt = f"""
You are a helpful assistant who answers the user's query.
Example:
Query: What is Node.js and how does it work?
Answer: Node.js is a JavaScript runtime environment that
allows you to execute JavaScript code outside of a web browser.
It is used for server-side development and building APIs.
"""
# Generate the hypothetical answer with Gemini (reusing the client created above)
response = genai_client.models.generate_content(
model='gemini-2.0-flash-001',
contents=user_query,
config=types.GenerateContentConfig(
system_instruction=system_prompt
),
)
hypothetic_answer = response.text
relevant_chunks = retriever.similarity_search(
query=hypothetic_answer,
)
# Generation part
final_system_prompt = f"""
You are a helpful assistant who answers the user's query using the following pieces of context.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Context:
{[doc.page_content for doc in relevant_chunks]}
"""
# Final call to Gemini using the retrieved context
final_response = genai_client.models.generate_content(
model='gemini-2.0-flash-001',
contents=user_query,
config=types.GenerateContentConfig(
system_instruction=final_system_prompt
),
)
print(final_response.text)
That wraps up this full blog on query transformation patterns. I have explained everything you need to know and how to implement these patterns in RAG systems using the Gemini API.
#chaicode #GenAI #GenAICohort
Written by Ansh Mittal