Reciprocal Rank Fusion Improves the Accuracy of Our RAG

SUPRABHAT

What is Reciprocal Rank Fusion

Reciprocal Rank Fusion (RRF) is a technique for ranking the documents we fetch (search) during Parallel Query Retrieval. It takes multiple ranked lists of search results (for example, from different queries, search engines, or AI models) and merges them into a single, better list. Each document earns a score from every list it appears in, and that score is larger the closer the document is to the top of the list. So a document that is repeated across several lists, or that sits near the top of a list, ends up higher in the final combined list.
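As a formula, the usual definition of the RRF score can be sketched like this. The symbols are notation assumed for this sketch: R is the set of ranked lists, rank_r(d) is the 1-based position of document d in list r, and k is a smoothing constant (60 is a common default). Lists that do not contain d add nothing to its score.

```latex
\mathrm{RRF}(d) = \sum_{\substack{r \in R \\ d \in r}} \frac{1}{k + \operatorname{rank}_r(d)}
```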

How Reciprocal Rank Fusion Works

To understand Reciprocal Rank Fusion, we can reuse the diagram from Parallel Query Retrieval.

In this diagram, we give the user query to an LLM, which generates 3 queries. With the help of these queries, we search for documents in the database. Then we filter the results and Reciprocal Rank Fusion takes place: it ranks the documents on the basis of their occurrence (repetition) and their position in the result lists. Here the red document is ranked 1st because of how often it occurs, the yellow document 2nd, and the green document 3rd. Finally, we pass the ranked documents to the LLM together with the original user query for context, and the LLM returns its output on the basis of this information.
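The flow described above can be sketched end to end in a few lines. This is only a minimal sketch with stand-ins: generate_queries, search, FAKE_INDEX, and the query strings are hypothetical placeholders for the real LLM call and the real database, and the result lists are made up so that red, yellow, and green come out in the same order as in the diagram.

```python
# Stand-in for the LLM step that turns one user query into 3 queries (hypothetical).
def generate_queries(user_query):
    return [
        user_query,
        "explain " + user_query,
        user_query + " examples",
    ]

# Stand-in for the database: each query maps to a ranked list of document ids.
FAKE_INDEX = {
    "rank fusion": ["red", "yellow", "green"],
    "explain rank fusion": ["red", "green", "blue"],
    "rank fusion examples": ["yellow", "red", "purple"],
}

def search(query):
    # Return the ranked document ids for one query (best match first).
    return FAKE_INDEX[query]

def reciprocal_rank_fusion(rankings, k=60):
    # Sum 1 / (k + position) over every list a document appears in.
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0) + 1 / (k + rank + 1)
    return sorted(scores.items(), key=lambda x: x[1], reverse=True)

user_query = "rank fusion"
queries = generate_queries(user_query)
rankings = [search(q) for q in queries]        # one ranked list per query
fused = reciprocal_rank_fusion(rankings)       # merge into a single ranking
top_docs = [doc_id for doc_id, _ in fused[:3]]

# The fused documents and the original user query go to the LLM together.
prompt = f"Context documents: {top_docs}\nQuestion: {user_query}"
print(prompt)
```

In a real pipeline the stand-ins would be replaced by an LLM call and a vector search, while the fusion step stays the same.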

Code for Ranking the Documents

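def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list of (doc_id, score)."""
    scores = {}
    for ranking in rankings:
        # enumerate() is 0-based, so rank + 1 is the 1-based position in the list
        for rank, doc_id in enumerate(ranking):
            score = 1 / (k + rank + 1)
            scores[doc_id] = scores.get(doc_id, 0) + score
    # Highest fused score first
    return sorted(scores.items(), key=lambda x: x[1], reverse=True)

# Example input (illustrative ids): one ranked list of document ids per query
rankings = [
    ["doc_red", "doc_yellow", "doc_green"],
    ["doc_red", "doc_green"],
    ["doc_yellow", "doc_red"],
]

fused = reciprocal_rank_fusion(rankings)
top_ids = [doc_id for doc_id, _ in fused[:5]]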
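To see what the constant k does, here is a small hedged experiment. The helper rrf_scores and the document ids are hypothetical and exist only for this illustration; k=0 is not a recommended setting, it is used here just to show the contrast with k=60.

```python
def rrf_scores(rankings, k):
    """Return a dict of doc_id -> fused RRF score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1 / (k + rank + 1)
    return scores

# Hypothetical result lists: "A" is 1st in one list only, "B" is 3rd in both lists.
example_rankings = [
    ["A", "x", "B"],
    ["y", "z", "B"],
]

scores_k60 = rrf_scores(example_rankings, k=60)
scores_k0 = rrf_scores(example_rankings, k=0)

print("k=60:", scores_k60["A"], scores_k60["B"])
print("k=0: ", scores_k0["A"], scores_k0["B"])
```

With k=60, B scores 2/63 (about 0.0317) and beats A at 1/61 (about 0.0164), so repetition across lists wins. With k=0, A scores 1.0 and beats B at about 0.667, so a single top position dominates. A larger k makes RRF reward documents that keep showing up across lists, which matches the occurrence-based ranking described earlier.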

Summary

Reciprocal Rank Fusion (RRF) is a simple yet powerful technique used to combine search results from multiple sources or models. Instead of relying on just one ranking, RRF looks at several ranked lists and gives each result a score based on how high it appears in each of them, then adds those scores together. Results that rank high in several lists get the best overall score, which makes the final combined list more relevant and balanced. It’s especially useful in search systems and AI applications like RAG, where merging the best results from different retrieval methods can improve accuracy without needing complex training.

  1. What is RAG

  2. Parallel Query Retrieval

  3. Chain of Thought

  4. Step-Back Prompting

  5. Hypothetical Document Embeddings
