Reciprocal Rank Fusion (RRF)

gautam kumar
5 min read

Introduction

Reciprocal Rank Fusion (RRF) is a technique used to combine the results from multiple searches or ranking systems. It improves the overall quality of search results by giving higher importance to items that appear near the top of each individual ranking list, ensuring that the most relevant items are prioritized.

Reciprocal Rank Fusion (RRF) is like asking multiple experts for their top picks and then combining their lists to find the best overall choices. It helps in getting the most relevant answers by giving more importance to items that appear at the top of each expert's list.

"In a world of scattered answers, Reciprocal Rank Fusion weaves the best of many into one voice of truth."

Hitesh Choudhary

Let's take an example of e-commerce websites to understand Reciprocal Rank Fusion (RRF).

You want to buy a laptop. But you don't want to trust just one website. So you search for "best coding laptop" on:

  • Amazon

  • Flipkart

  • Alibaba

Each site gives you a different list of laptops ranked by its own system. But you want one final best list based on all three sites together, so you get the most reliable answer.

Let's look at the architectural diagram of RRF.

Architecture

How Reciprocal Rank Fusion (RRF) helps

  • Look at how high a laptop is ranked on each website.

  • Give more points to laptops that are ranked higher.

  • If a laptop is ranked 1st somewhere, it gets a lot of points.

  • If it's ranked 10th, it gets fewer points, but still some.

  • After adding up points from all websites, the laptops with the most total points come out on top.

Even if a laptop isn't #1 everywhere, if it ranks well across multiple sites, it can come out top-ranked overall.
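In code, the point system above is a single line: each appearance of an item contributes 1/(k + rank), where k (conventionally 60) softens the gap between the very top ranks. A minimal sketch (`rrf_score` is just an illustrative name, not a library function):

```python
def rrf_score(ranks, k=60):
    """Total RRF points for one item, given its 1-based rank in each list."""
    return sum(1.0 / (k + r) for r in ranks)

print(rrf_score([1]))   # ranked 1st on one site: the most points one list can give
print(rrf_score([10]))  # ranked 10th: fewer points, but still some
```

Note how gently the score decays: rank 1 earns 1/61 and rank 10 earns 1/70, so appearing on many lists quickly outweighs one great rank on a single list.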

Example

  • Laptop A is #2 on Amazon, #5 on Flipkart, #1 on Alibaba.

  • Laptop B is #1 on Amazon, #20 on Flipkart, #10 on Alibaba.

  • Laptop C is #4 on Amazon, #3 on Flipkart, #2 on Alibaba.

Which one comes out on top?

Run the numbers and Laptop A wins narrowly, since it is strong on all three sites. The more telling comparison is Laptop B versus Laptop C: B holds a #1 spot yet still loses to C, which is merely good everywhere. That's exactly what RRF does: it rewards consistent performers over one-list stars.
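This is easy to check with the standard 1/(k + rank) scoring (k = 60): compare Laptop B's spiky ranks against Laptop C's consistent ones.

```python
k = 60
ranks = {
    "Laptop B": [1, 20, 10],  # Amazon, Flipkart, Alibaba
    "Laptop C": [4, 3, 2],
}
# RRF SCORE = SUM OF 1/(k + rank) ACROSS ALL THREE SITES
scores = {name: sum(1.0 / (k + r) for r in rs) for name, rs in ranks.items()}
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.4f}")
# Laptop C: 0.0476 beats Laptop B: 0.0432, despite B's single #1 rank.
```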

RRF with RAG application

Normally, RAG systems retrieve documents from a single source, like one database or vector store. But what if you have multiple sources?

Multiple Sources Could Include:

  • Different databases (e.g., FAQs, User Manuals, Forum Posts)

  • Different retrieval engines (e.g., Elasticsearch, Qdrant, Pinecone)

  • Different models (e.g., OpenAI embeddings, HuggingFace embeddings)

Each source might give you a different list of relevant documents, and they won't always agree on which documents are the most important.

Reciprocal Rank Fusion (RRF) with RAG

RRF is a technique that combines these different lists into one final, more reliable list. Here's how it works:

  1. User asks a question.

  2. Query is sent to multiple retrievers (e.g., FAQ DB, Manual DB, Forum DB).

  3. Each retriever gives a ranked list of documents.

  4. RRF merges these lists into one better final list.

  5. Pick the top documents (e.g., top 5).

  6. Send those documents to the LLM (like GPT) to generate the answer.

In RAG, RRF helps pick the best documents from multiple sources, so the LLM can answer more accurately and reliably.

Code

Before you run the program, make sure to install all the dependencies below and create a virtual environment. I am using “uv” to create the virtual environment; for more details, check out the video:

https://www.youtube.com/watch?v=8mk85fyzevc

To set up Qdrant locally, follow the resources below:

https://qdrant.tech/documentation/quickstart/

https://www.youtube.com/watch?v=mHrwS6ZoNKc

Algorithm

Step 1 => Load PDF
Step 2 => Split text into smaller chunks
Step 3 => Embed chunks and store in Qdrant vector stores
Step 4 => Retrieve documents for a query
Step 5 => Merge the retrieved documents using RRF
Step 6 => Print final ranked results

RRF implementation

# IMPORT LIBRARIES
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Qdrant
from langchain.schema import Document
from langchain.embeddings import OpenAIEmbeddings
from typing import List, Dict
import os
from dotenv import load_dotenv

# LOAD ENVIRONMENT VARIABLES
load_dotenv()

# SET API KEYS
openai_api_key = os.getenv("OPENAI_API_KEY")

# -------------------------------
# STEP 1: LOAD PDF
# -------------------------------

def load_documents(pdf_path: str) -> List[Document]:
    loader = PyPDFLoader(pdf_path)
    return loader.load()

# LOAD ONLY ONE PDF (SIMULATING MULTIPLE SOURCES)
pdf_path = "ecommerce_products.pdf"
docs = load_documents(pdf_path)
print(f"LOADED {len(docs)} DOCUMENTS FROM PDF")

# -------------------------------
# STEP 2: SPLIT INTO SMALLER CHUNKS
# -------------------------------

splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)
print(f"TOTAL CHUNKS CREATED: {len(chunks)}")

# -------------------------------
# STEP 3: SPLIT CHUNKS INTO MULTIPLE SOURCES
# -------------------------------

# DIVIDE CHUNKS INTO 3 SOURCES
chunks_amazon = chunks[0::3]   # EVERY 3RD CHUNK STARTING FROM 0
chunks_flipkart = chunks[1::3] # EVERY 3RD CHUNK STARTING FROM 1
chunks_mart = chunks[2::3]  # EVERY 3RD CHUNK STARTING FROM 2

print(f"AMAZON CHUNKS: {len(chunks_amazon)}")
print(f"FLIPKART CHUNKS: {len(chunks_flipkart)}")
print(f"WALMART CHUNKS: {len(chunks_mart)}")

# -------------------------------
# STEP 4: EMBED THE CHUNKS & STORE IN VECTOR DB
# -------------------------------

# EMBEDDING MODEL
embedding_model = OpenAIEmbeddings(openai_api_key=openai_api_key)

# CREATE VECTORSTORES FOR EACH SOURCE
db_amazon = Qdrant.from_documents(chunks_amazon, embedding=embedding_model, url="http://localhost:6333", collection_name="amazon")
db_flipkart = Qdrant.from_documents(chunks_flipkart, embedding=embedding_model, url="http://localhost:6333", collection_name="flipkart")
db_walmart = Qdrant.from_documents(chunks_mart, embedding=embedding_model, url="http://localhost:6333", collection_name="walmart")

print("VECTORSTORES CREATED.")

# -------------------------------
# STEP 5: RETRIEVE DOCUMENTS FOR USER QUERY
# -------------------------------

query = "best mobile phone under 20000"

# TOP 5 RESULTS FROM EACH
amazon_results = db_amazon.similarity_search(query, k=5)
flipkart_results = db_flipkart.similarity_search(query, k=5)
walmart_results = db_walmart.similarity_search(query, k=5)

print("RETRIEVED DOCUMENTS FROM ALL SOURCES.")

# -------------------------------
# STEP 6: APPLY RECIPROCAL RANK FUSION (RRF)
# -------------------------------

def reciprocal_rank_fusion(results: List[List[Document]], k=60) -> List[Document]:
    scores: Dict[str, float] = {}
    docs: Dict[str, Document] = {}

    for result_set in results:
        for rank, doc in enumerate(result_set):
            # USE THE CHUNK TEXT AS A SIMPLE DOCUMENT ID
            doc_id = doc.page_content
            # enumerate IS 0-BASED, SO rank + 1 GIVES THE STANDARD 1/(k + rank) RRF SCORE
            score = 1.0 / (k + rank + 1)
            scores[doc_id] = scores.get(doc_id, 0) + score
            docs[doc_id] = doc

    # HIGHEST COMBINED SCORE FIRST
    sorted_doc_ids = sorted(scores.keys(), key=lambda x: scores[x], reverse=True)

    return [docs[doc_id] for doc_id in sorted_doc_ids]

# APPLY RRF TO MERGE RESULTS
merged_results = reciprocal_rank_fusion([amazon_results, flipkart_results, walmart_results])

print("APPLIED RRF TO MERGE DOCUMENTS.")

# -------------------------------
# STEP 7: FINAL RESULT
# -------------------------------

print("\n=== FINAL MERGED TOP RESULTS ===\n")
for idx, doc in enumerate(merged_results):
    print(f"{idx+1}. {doc.page_content[:200]}...")
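Step 6 of the pipeline (sending the top documents to the LLM) isn't shown above. A minimal sketch of that last hop, with the retrieved chunks stubbed as plain strings so it runs on its own; `build_prompt` is an illustrative helper, and in the real pipeline `top_chunks` would be `[d.page_content for d in merged_results[:5]]`:

```python
def build_prompt(contexts, question):
    """Pack the top RRF-merged chunks into a grounded prompt for the LLM."""
    context_block = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(contexts))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\nAnswer:"
    )

# STUB CHUNKS STANDING IN FOR merged_results[:5]
top_chunks = ["Phone X: 6GB RAM, Rs. 18,999", "Phone Y: 8GB RAM, Rs. 19,499"]
prompt = build_prompt(top_chunks, "best mobile phone under 20000")
print(prompt)

# WITH AN OPENAI-BACKED LLM (NEEDS OPENAI_API_KEY), ROUGHLY:
# from langchain.chat_models import ChatOpenAI
# answer = ChatOpenAI(openai_api_key=openai_api_key).predict(prompt)
```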

Full working code on Github:

https://github.com/gautamkmahato

Conclusion

Using Reciprocal Rank Fusion (RRF) with a RAG app is like getting the best picks from multiple search engines and blending them into one awesome answer. It makes your RAG results way smarter, more balanced, and just overall better for real-world use.
