📚 What is Reciprocal Rank Fusion (RRF)?

hardik sehgal

Reciprocal Rank Fusion (RRF) is a technique used in Retrieval-Augmented Generation (RAG) systems to combine multiple ranked lists of documents into a single, improved ranking. It's particularly useful for RAG-Fusion, which enhances RAG by generating multiple queries and then using RRF to re-rank the results.

Let me explain RRF in beginner-friendly terms, then compare it with filter-based fan-out retrieval, and finally show how you can apply it to your multi-query PDF RAG system.

📚 What is Reciprocal Rank Fusion (RRF)?

RRF starts the same way as Parallel Query (Fan Out) Retrieval: the LLM generates 3 to 5 different queries based on the original user query, and each query is searched separately. Now imagine you searched 3 slightly different versions of your question in Qdrant and got these relevant chunks, ranked from best to worst:

| Rewritten Query | Top Results (Ranked) |
|---|---|
| Query 1 | 📄A, 📄B, 📄C |
| Query 2 | 📄C, 📄D, 📄E |
| Query 3 | 📄F, 📄A, 📄D |

Now, instead of:

  • Just combining all chunks (👎 duplicates possible)

  • Or filtering for unique ones (👎 loses score/rank info)

You fuse the results by rank. Each chunk receives a score based on the position it holds in every result list it appears in, and the chunks are then ordered by that fused score, so chunks that rank highly for several generated queries rise to the top.
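For reference, the fused score is usually written as below. The notation is introduced here for illustration: $Q$ is the set of generated queries, $\operatorname{rank}_q(d)$ is the 1-based position of chunk $d$ in the results for query $q$ (the term is skipped when $d$ does not appear in that list), and $k$ is a smoothing constant, commonly set to 60.

```latex
\mathrm{RRF}(d) = \sum_{q \in Q} \frac{1}{k + \operatorname{rank}_q(d)}
```

A chunk gains score from every list it appears in, and positions near the top contribute more than positions further down.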

πŸ” RRF vs Parallel Query (Fan Out) Retrieval (Comparison)

| Feature | Parallel Query | Reciprocal Rank Fusion (RRF) |
|---|---|---|
| Keeps all chunks | ✅ Yes | ✅ Yes |
| Uses rank from each search | ❌ No | ✅ Yes |
| Gives weight to overlap | ❌ All equal | ✅ Overlap = more score |
| Handles chunk quality | ❌ Not really | ✅ Top-ranked chunks score better |
| Good when...? | Results are noisy | Ranks matter + you want diversity |
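To make the comparison concrete, here is a minimal sketch that runs both merge strategies on the three example result lists from earlier. The function names `fan_out_unique` and `rrf_order` are made up for this illustration, and `fan_out_unique` is only one simple way to do a filter-based merge (keep each chunk once, in first-seen order).

```python
from collections import defaultdict

# Example result lists from earlier (one list per rewritten query).
results_list = [
    ["A", "B", "C"],  # Query 1
    ["C", "D", "E"],  # Query 2
    ["F", "A", "D"],  # Query 3
]


def fan_out_unique(results_list: list[list[str]]) -> list[str]:
    """Filter-based merge: keep each chunk once, in first-seen order."""
    flat = (chunk_id for result in results_list for chunk_id in result)
    return list(dict.fromkeys(flat))


def rrf_order(results_list: list[list[str]], k: int = 60) -> list[str]:
    """Rank-aware merge: order chunks by their summed reciprocal-rank score."""
    scores = defaultdict(float)
    for result in results_list:
        for rank, chunk_id in enumerate(result):  # rank is 0-based
            scores[chunk_id] += 1 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)


print(fan_out_unique(results_list))  # ['A', 'B', 'C', 'D', 'E', 'F']
print(rrf_order(results_list))       # ['A', 'C', 'D', 'F', 'B', 'E']
```

Both approaches keep all six chunks, but only RRF moves C and D (each found by two queries) ahead of B, which was found by one.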

💡 Why RRF Is Great for You

  • In fan-out RAG, different queries return different "views" of the same concept.

  • Some chunks show up across multiple queries = likely to be super relevant

  • RRF promotes these, without needing to manually guess which is best.

Code to get the ranking of each chunk

Before going through this code, you should understand how the following work:

  1. RAG

  2. Parallel Query Retrieval

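from collections import defaultdict

def reciprocal_rank_fusion(results_list: list[list[str]], k: int = 60) -> list[str]:
    """
    Fuse several ranked lists of chunk IDs into one ranking.

    results_list = [
      [chunk_id1, chunk_id2, chunk_id3],  # from query1
      [chunk_id3, chunk_id4, chunk_id5],  # from query2
      ...
    ]
    k: smoothing constant (60 is a common default)
    Returns: ranked list of unique chunk IDs, best first
    """
    scores = defaultdict(float)  # chunk_id -> accumulated RRF score

    for result in results_list:
        # enumerate() counts from 0, so rank + 1 is the 1-based position
        for rank, chunk_id in enumerate(result):
            scores[chunk_id] += 1 / (k + rank + 1)

    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)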
  1. results_list: a list of ranked lists, one per generated query. Each inner list holds chunk IDs ordered from most to least relevant.

  2. scores = defaultdict(float): creates a dictionary where:

    • Keys = chunk IDs

    • Values = scores (each starts at 0.0 by default)

  3. enumerate(result), a built-in Python function, gives us each chunk's position in the list (the rank, counted from 0) together with its chunk_id.

  4. For each chunk_id, we add 1 / (k + rank + 1) to its score. The + 1 converts the 0-based position into a 1-based rank.

  5. Let's say k = 60:

    • If chunk1 is at rank 0 → score = 1 / (60 + 0 + 1) = 1/61

    • If it's at rank 1 → score = 1 / (60 + 1 + 1) = 1/62

    • So higher-ranked items (closer to the top) get a larger score.
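The arithmetic above can be checked with a few lines of Python. This is only a small sketch: `rrf_contribution` is a hypothetical helper name, not part of the function shown earlier, and `Fraction` is used just to keep the values exact.

```python
from fractions import Fraction


def rrf_contribution(rank: int, k: int = 60) -> Fraction:
    """Score a single result list adds for a chunk at 0-based position `rank`."""
    return Fraction(1, k + rank + 1)


# Print the exact and approximate contribution for the first three positions.
for rank in range(3):
    contribution = rrf_contribution(rank)
    print(rank, contribution, round(float(contribution), 5))
```

With k = 60 the contributions shrink slowly (1/61, 1/62, 1/63), so the gap between neighbouring positions is small and appearing in several lists matters a lot.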

💡 Also, if the same chunk_id appears in multiple lists, the scores it earns from each list are added up. This is how RRF merges results and rewards consensus.
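To see consensus at work, and how the fusion step might slot into a multi-query pipeline, here is a rough sketch. Everything named here is an assumption for illustration: `fused_scores`, `multi_query_retrieve`, `fake_search`, and the chunk IDs are made up, and `fake_search` merely looks up canned results where a real system would call its vector store (for example Qdrant). The LLM step that generates the rewritten queries is left out.

```python
from collections import defaultdict
from typing import Callable


def fused_scores(results_list: list[list[str]], k: int = 60) -> dict[str, float]:
    """Sum the reciprocal-rank contributions each chunk earns across all lists."""
    scores: dict[str, float] = defaultdict(float)
    for result in results_list:
        for rank, chunk_id in enumerate(result):  # rank is 0-based
            scores[chunk_id] += 1 / (k + rank + 1)
    return dict(scores)


def multi_query_retrieve(
    queries: list[str],
    search_fn: Callable[[str], list[str]],
    top_n: int = 3,
    k: int = 60,
) -> list[str]:
    """Run every rewritten query, fuse the ranked results, keep the best top_n."""
    results_list = [search_fn(query) for query in queries]
    scores = fused_scores(results_list, k)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Made-up canned results standing in for a real vector-store search.
CANNED_RESULTS = {
    "rewritten query 1": ["c1", "c2", "c3"],
    "rewritten query 2": ["c4", "c2", "c5"],
    "rewritten query 3": ["c6", "c2", "c1"],
}


def fake_search(query: str) -> list[str]:
    """Hypothetical stand-in: return the canned ranked chunk IDs for a query."""
    return CANNED_RESULTS[query]


print(multi_query_retrieve(list(CANNED_RESULTS), fake_search, top_n=2))
# ['c2', 'c1']
```

Here c2 is never ranked first by any single query, yet it wins because all three queries return it (3 × 1/62), ahead of c1, which appears first once and third once (1/61 + 1/63).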
