What is it?

RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval) is a new technique for improving retrieval-augmented language models, particularly for long documents: https://arxiv.org/html/2401.18059v1

Problems addressed

Most existing retrieval methods only retrieve short, contiguous text chunks, limiting their ability to represent large-scale discourse structure and answer thematic questions that require integrating knowledge from multiple parts of a text.

Approach

Recursively embeds, clusters, and summarizes chunks of text
Constructs a tree with different levels of summarization from bottom up
At inference time, retrieves from this tree, integrating information across lengthy documents at different levels of abstraction

Process

The process begins by segmenting text into 100-token chunks and embedding them using SBERT. RAPTOR then employs Gaussian Mixture Models for clustering similar chunks, which are summarized using GPT-3.5-turbo. This process is repeated, building the tree from bottom up:

i. Segments text into 100-token chunks ii. Embeds chunks using SBERT iii. Clusters similar chunks iv. Summarizes clusters using GPT-3.5-turbo v. Repeats process, building tree from bottom up

Querying

Collapsed tree method outperforms tree traversal
Retrieves nodes across all layers based on relevance
Uses cosine similarity for matching

Key features

Builds hierarchical tree of text summaries
Retrieves from multiple abstraction levels
Uses clustering (GMMs) and summarization (LLMs)
Offers flexible querying (tree traversal / collapsed tree)

Benefits:

Outperforms traditional methods (e.g., BM25, DPR) on QA tasks
Excels at complex queries needing multi-part info
Scales linearly with document length
Sets new SOTA on some benchmarks when paired with GPT-4

Evaluation was conducted on NarrativeQA, QASPER, and QuALITY datasets, using metrics such as BLEU, ROUGE, METEOR, F1 score, and Accuracy.

RAPTOR: Tree-based Retrieval for Language Models

Table of contents