Reciprocal Rank Fusion is similar to Parallel Query Retrieval. The difference comes after retrieval: instead of simply keeping the unique documents, we rank them based on how often and how high they appear across the queries. That's how we get even more relevant context for the user query.

Reciprocal Rank Fusion:

What is it?

It's a technique where, based on the user query, the LLM gives us a few related queries and we retrieve docs for all of them in parallel (the same as Parallel Query Retrieval up to this point). Then, instead of just keeping the unique documents, we rank the retrieved docs based on where they appear across the results of all queries. Using those ranks, we prioritize the docs for the context, which gives the LLM more relevant context for a better response.
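The ranking step can be sketched with the standard Reciprocal Rank Fusion formula: every result list gives a document a score of 1 / (k + rank), and the scores are summed across lists. The function name, the sample document IDs, and the constant k = 60 (a commonly used default) are assumptions for illustration and are not taken from the linked code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranked list.

    rankings: list of lists, each ordered best-first (one list per query).
    k: smoothing constant; 60 is a common default and an assumption here.
    """
    scores = defaultdict(float)
    for ranked_docs in rankings:
        for rank, doc_id in enumerate(ranked_docs, start=1):
            # A document earns more when it appears often and near the top.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; doc_id breaks ties so the order is stable.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results of three query variations, best match first.
rankings = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a", "doc_d"],
    ["doc_b", "doc_e", "doc_a"],
]
fused = reciprocal_rank_fusion(rankings)
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

Here doc_b and doc_a both appear in all three lists, but doc_b sits at the top twice, so it ends up first. This is the "frequency and order" idea in one formula.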

Where do we apply this in RAG?

RAG has three major steps: Indexing, Retrieval, and Generation. Indexing stores the data sources in a database by creating vector embeddings of data chunks. Retrieval starts after we receive the user query: it fetches the relevant data, which we pass as context, along with the user query, to the Generation step, which finally generates the response.
So, the RRF technique applies at the second step, i.e. Retrieval.

How does it work?

On receiving the user query, we ask our LLM to create some similar queries (let's call them Query Variations).
We process those Query Variations in parallel by creating their vector embeddings and performing a semantic search with each one to get relevant data (let's call them Docs).
Each Query Variation returns some Docs. We rank these Docs by how frequently they occur across the results and by their position in each result list, then prioritize them by that ranking and pass the top ones as context.
In the Generation step, the LLM takes the context we provide, along with the original user query, and outputs a more accurate response.
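The first two steps above (query variations, then parallel retrieval) can be sketched as follows. This is a minimal illustration, not the linked implementation: `generate_query_variations` is a hypothetical stand-in for the LLM call, and `search` replaces embedding plus vector search with a simple word-overlap score over a toy in-memory corpus so the example runs without any external service.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy corpus; in a real pipeline these would be chunks in a vector store.
CORPUS = {
    "doc_a": "reciprocal rank fusion merges ranked lists",
    "doc_b": "vector embeddings enable semantic search",
    "doc_c": "rank fusion improves retrieval in rag",
    "doc_d": "llm generation uses retrieved context",
}


def generate_query_variations(user_query):
    # Hypothetical stand-in for the LLM call that rewrites the query.
    return [
        user_query,
        "how does rank fusion work in retrieval",
        "merging ranked lists from semantic search",
    ]


def search(query, top_k=3):
    # Stand-in for embedding + similarity search: score by shared words.
    query_words = set(query.lower().split())
    scored = []
    for doc_id, text in CORPUS.items():
        overlap = len(query_words & set(text.split()))
        if overlap:
            scored.append((overlap, doc_id))
    # Most overlap first; doc_id breaks ties so the output is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


def retrieve_in_parallel(queries):
    # One search per query variation, run concurrently; input order is kept.
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        return list(pool.map(search, queries))


variations = generate_query_variations("what is reciprocal rank fusion")
rankings = retrieve_in_parallel(variations)
for query, ranked_docs in zip(variations, rankings):
    print(query, "->", ranked_docs)
```

Each query variation produces its own best-first list of Docs. These per-query lists are exactly what the ranking step takes as input.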

How did the response get more accurate?

We augmented the context: the similar queries bring in more data, and ranking their outputs pushes the most relevant documents to the top. With this augmented context, we get a more precise response that is aligned with the user query.

How is it different from Normal RAG?

In Normal RAG, we run retrieval by creating a vector embedding of the user query directly, then searching the database for semantically related data to use as context. Here, we create related queries, run the retrieval step on them in parallel, and then rank the results to get prioritized, more relevant context to generate the response from.

Working Step by Step with Code & Visual:

From the user prompt, we ask our LLM to generate more similar queries (let's say 3 queries).
Then, we create their vector embeddings and perform a semantic search to get relevant data.
Next, we deduplicate the retrieved data so that each document appears only once, rank the documents, and keep the top four out of everything fetched.

Now, we pass the original user prompt along with our context to the LLM to get the final response.
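The last two steps (keeping the top four documents and passing them with the original prompt) could look roughly like this. The fused scores, the document texts, the function name `build_prompt`, and the prompt wording are all made up for illustration. Because the scores live in a dictionary keyed by document ID, every document is already unique at this point.

```python
def build_prompt(user_query, fused_scores, documents, top_n=4):
    """Keep the top_n highest-scoring documents and format the final prompt."""
    # Highest score first; doc_id breaks ties so the selection is stable.
    ranked_ids = sorted(fused_scores, key=lambda d: (-fused_scores[d], d))
    top_ids = ranked_ids[:top_n]
    context = "\n\n".join(documents[doc_id] for doc_id in top_ids)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {user_query}"
    )
    return top_ids, prompt


# Hypothetical fused scores for five unique documents.
fused_scores = {
    "doc_a": 0.0484,
    "doc_b": 0.0489,
    "doc_c": 0.0159,
    "doc_d": 0.0159,
    "doc_e": 0.0161,
}
documents = {
    "doc_a": "Text of document A.",
    "doc_b": "Text of document B.",
    "doc_c": "Text of document C.",
    "doc_d": "Text of document D.",
    "doc_e": "Text of document E.",
}

top_ids, prompt = build_prompt("What is reciprocal rank fusion?", fused_scores, documents)
print(top_ids)
print(prompt)
```

The resulting prompt string is what would be sent to the LLM. Only the original user query goes into it, not the generated query variations.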

Reciprocal Rank Fusion Output:

Important Links:

RECIPROCAL RANK FUSION Code - Visit Here!

I have discussed all the Advanced RAG techniques. Check out the Advanced RAG Article Series!

Advanced RAG Series Repository → Visit Repo Here!

Conclusion:

I have just explained my learnings on the Reciprocal Rank Fusion technique! If you find it useful, don't forget to like this article and follow me for more such informative articles.

Credits:

I am very grateful to ChaiCode for providing all this knowledge, insights, and deep learning about AI: Piyush Garg, Hitesh Choudhary

If you want to learn too, you can join here → Cohort || Apply from ChaiCode and use NAKUL51937 to get 10% off

Thanks:

Feel free to comment with your thoughts. I would love to hear your feedback!

Thanks for giving your precious time to read this article.

Connect on other Platforms:

Let’s learn something together: LinkedIn, Twitter

If you would like, you can check out my Portfolio

Reciprocal Rank Fusion – A Query Transformation Technique for Advanced RAG
