Reciprocal Rank Fusion: Query Transformation Technique

Garv
2 min read

đź“–Introduction

This article is part of the Advanced RAG series, which explains the various tenets and features of Advanced RAG systems. In this article, Reciprocal Rank Fusion (RRF), a query transformation/translation technique, is explained along with a diagram and code. RRF works on the same principle as Parallel Query Retrieval; the only difference is that the documents retrieved for each query are ranked based on their order and frequency of occurrence.


🔀What is RRF?

In a RAG system, there are three steps involved:

  1. Indexing the external knowledge documents in the vector store in the form of vector embeddings.

  2. Retrieving the relevant document chunks which are semantically similar to the user’s query.

  3. Generating the response to the user’s query with the LLM, based on the context fed to it, which is the result of the retrieval step.

To learn in more detail about the steps of a RAG System, see this article here.
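The three steps above can be sketched in a few lines of Python. This is a toy illustration, assuming a stand-in `embed()` based on word overlap; a real system would use an embedding model, a vector store, and an LLM, and all function names here are my own, not from any particular library:

```python
def embed(text):
    # Hypothetical "embedding": just the set of lowercase words.
    return set(text.lower().split())

def index(documents):
    # Step 1: store each chunk alongside its embedding.
    return [(doc, embed(doc)) for doc in documents]

def retrieve(query, store, top_k=2):
    # Step 2: rank chunks by overlap with the query "embedding".
    q = embed(query)
    ranked = sorted(store, key=lambda pair: -len(q & pair[1]))
    return [doc for doc, _ in ranked[:top_k]]

def generate(query, context):
    # Step 3: a real system would pass query + context to an LLM here.
    return f"Answer to {query!r} using context: {context}"
```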


Now, in the case of Advanced RAG, where the retrieval step is performed in the RRF fashion, what actually happens is the following:

  1. After the user’s query is received, multiple versions of the same query are created.

  2. For each query, a semantic search is performed in the vector store.

  3. Based on the results of each query, the relevant document chunks obtained are ranked based on their order and frequency of occurrence.
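The ranking in step 3 is the heart of RRF: each document scores the sum of 1 / (k + rank) over every result list it appears in, so chunks that rank high in many lists (order plus frequency of occurrence) float to the top. Here is a minimal sketch, using the conventional k = 60; the function name and signature are my own choice, not a standard API:

```python
from collections import defaultdict

def rrf_fuse(ranked_lists, k=60, top_n=4):
    """Fuse several ranked result lists with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            # A document's score grows with every list it appears in,
            # and grows more the higher (smaller rank) it appears.
            scores[doc_id] += 1.0 / (k + rank)
    # Return the top_n unique documents by fused score.
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

For example, a document ranked 2nd, 1st, and 1st across three lists beats one ranked 1st in only a single list, because it accumulates score from all three.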


⚡Effect of RRF Query Transformation

Since the user query is transformed into multiple versions, the number of relevant document chunks retrieved increases. After ranking, these chunks are augmented to the LLM’s context along with the original user prompt, leading to a more precise response.


📊💻Step By Step Working Through Diagram & Code

  1. From the user prompt, the LLM generates similar queries (3 queries here).

  2. Then, vector embeddings are computed and a semantic search is performed for each query to get the relevant data.

  3. Next, the relevant data is de-duplicated to keep only unique chunks, which are then ranked and prioritized to get the top n documents (here n = 4) out of all the fetched documents.

  4. Now, the original user prompt is passed with the context to the LLM to get the final response.
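The four steps above can be wired together end to end. In this sketch, `fake_llm_rewrite`, `fake_search`, and the word-overlap scoring are hypothetical stand-ins for real LLM and vector-store calls, used only to show the flow:

```python
from collections import defaultdict

def fake_llm_rewrite(prompt, n=3):
    # Step 1: an LLM would produce n paraphrases of the prompt.
    return [f"{prompt} (rewrite {i})" for i in range(1, n + 1)]

def fake_search(query, corpus, top_k=5):
    # Step 2: stand-in for embedding + semantic search; documents are
    # ranked by how many words they share with the query.
    words = set(query.lower().split())
    return sorted(corpus, key=lambda d: -len(words & set(d.lower().split())))[:top_k]

def rrf_retrieve(prompt, corpus, k=60, top_n=4):
    # Step 3: fuse the per-query rankings with RRF and keep the top n
    # unique chunks. Step 4 (not shown) would send the original prompt
    # plus this context to the LLM for the final response.
    scores = defaultdict(float)
    for query in fake_llm_rewrite(prompt):
        for rank, doc in enumerate(fake_search(query, corpus), start=1):
            scores[doc] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```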


🏆Reciprocal Rank Fusion Output


  • Reciprocal Rank Fusion Code File

  • Advance RAG Article Series

  • Advance RAG Repository


🎯Conclusion

Through this article, you saw how to implement the Reciprocal Rank Fusion technique in your RAG system and make its responses more precise and efficient.
