Enhancing RAG Retrieval with HyDE


Introduction
Most RAG applications convert the user's query into a vector and search for similar chunks in the vector database.
But what if the user's query is vague or worded differently from the actual document?
In such cases, the application may retrieve results that miss the user's intent.
Problem Example
Suppose you have a RAG application that answers user queries from basic Python documentation.
Here’s a chunk of information stored in your vector database:
"The
zip()
function in Python allows you to iterate over multiple iterables (like lists) in parallel."
Now a user comes in and asks:
"How to loop over two lists together?"
If we compare this user query with the chunk in the database, we can clearly see that the answer is present.
But here's the problem: the text embedding of the user's query doesn't match the embedding of the document chunk very well. The mismatch happens because the query doesn't include the keyword zip(), which is present in the document.
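You can quantify this mismatch by embedding both texts and comparing them directly. Here's a minimal sketch (the cosine helper is just for illustration and isn't part of any library used later):

import numpy as np
from langchain_openai import OpenAIEmbeddings

embedder = OpenAIEmbeddings(model="text-embedding-3-small")

query = "How to loop over two lists together?"
chunk = ("The zip() function in Python allows you to iterate over "
         "multiple iterables (like lists) in parallel.")

q_vec, c_vec = embedder.embed_documents([query, chunk])

def cosine(a, b):
    a, b = np.array(a), np.array(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The query never mentions "zip()", so this similarity tends to be lower
# than you'd expect for a chunk that clearly answers the question.
print(f"query vs. chunk similarity: {cosine(q_vec, c_vec):.2f}")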
Here's How HyDE Fixes It
The user asks:
"How to loop over two lists together?"
The RAG system (with HyDE) asks the LLM:
"Write an answer to this Python programming question: How to loop over two lists together."
The LLM generates a hypothetical answer:
"You can use the
zip()
function to loop over two lists in parallel in Python."
This generated answer includes the keyword zip(), which is present in the database. Instead of embedding the original user query, HyDE embeds this hypothetical answer.
Now the vector store easily matches it with:
"The
zip()
function in Python allows you to iterate over multiple iterables (like lists) in parallel."
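Put differently, HyDE adds exactly one extra LLM call before retrieval. A compact sketch of the whole flow looks like this (it assumes the Qdrant collection built in the steps below; the detailed walkthrough follows):

from openai import OpenAI
from langchain_openai import OpenAIEmbeddings
from langchain_qdrant import QdrantVectorStore

client = OpenAI()
embedder = OpenAIEmbeddings(model="text-embedding-3-small")
store = QdrantVectorStore.from_existing_collection(
    url="http://localhost:6333",
    collection_name="learning_langchain-hyde",
    embedding=embedder,
)

query = "How to loop over two lists together?"

# 1. Ask the LLM for a hypothetical answer to the query.
hypothetical = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": f"Write an answer to this Python programming question: {query}"}],
).choices[0].message.content

# 2. Embed and search with the hypothetical answer instead of the raw query.
docs = store.similarity_search(hypothetical, k=3)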
🪜 Steps and Code
Load and Split the Document
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
loader = PyPDFLoader(file_path="python.pdf")
docs = loader.load()
# Split the pages into small overlapping chunks before embedding
splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=20)
split_docs = splitter.split_documents(docs)
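To confirm the split worked, you can peek at the resulting chunks (purely illustrative):

print(f"{len(split_docs)} chunks created")
print(split_docs[0].page_content)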
Embed and Store in Qdrant
from langchain_openai import OpenAIEmbeddings
from langchain_qdrant import QdrantVectorStore

embedder = OpenAIEmbeddings(model="text-embedding-3-small")

# Create (or recreate) the collection and index the split chunks in one call
vector_store = QdrantVectorStore.from_documents(
    documents=split_docs,
    url="http://localhost:6333",
    collection_name="learning_langchain-hyde",
    embedding=embedder,
    force_recreate=True,
)
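As a quick sanity check, you can ask Qdrant how many points landed in the collection (a small sketch using the qdrant-client library directly):

from qdrant_client import QdrantClient

qdrant_client = QdrantClient(url="http://localhost:6333")
count = qdrant_client.count(collection_name="learning_langchain-hyde", exact=True)
print(f"Indexed points: {count.count}")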
Initial Search With Raw Query
retriever = QdrantVectorStore.from_existing_collection(
    url="http://localhost:6333",
    collection_name="learning_langchain-hyde",
    embedding=embedder,
)

# Search with the raw user query and keep only reasonably similar chunks
results = retriever.similarity_search_with_score("How to loop over two lists together?", k=3)

THRESHOLD = 0.7
filtered = [(doc, score) for doc, score in results if score >= THRESHOLD]

if filtered:
    for doc, score in filtered:
        print(f"Score: {score:.2f} → {doc.page_content}")
else:
    print("❌ No relevant data found for this query.")
HyDE Step – Generate Hypothetical Answer
from openai import OpenAI

client = OpenAI()

system_prompt = """
You are an AI Assistant who can take Python queries and answer them correctly and concisely.
"""

# Ask the LLM to write a plausible (hypothetical) answer to the user's question
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "How to loop over two lists together?"},
    ],
)

hypothetical_answer = response.choices[0].message.content
print("🤖 HyDE-generated:", hypothetical_answer)
Final Search Using Hypothetical Answer
# Embed the hypothetical answer (not the raw query) and search again
results = retriever.similarity_search_with_score(hypothetical_answer, k=3)
filtered = [(doc, score) for doc, score in results if score >= THRESHOLD]

if filtered:
    for doc, score in filtered:
        print(f"Score: {score:.2f} → {doc.page_content}")
else:
    print("❌ No relevant data found even after HyDE.")
🧠 Final Thoughts
Traditional RAG systems are powerful, but they often fail when user queries are vague, incomplete, or use different phrasing than the source documents.
HyDE (Hypothetical Document Embeddings) bridges that gap by letting an LLM "guess" a likely answer, then using that richer semantic context to perform retrieval.
Thanks for reading! 🙌
If you found this helpful, share it with your team or drop a ⭐️ on the GitHub repo (if you open source your code).
Let’s keep building smarter, more human-friendly AI systems!