Enhancing Retrieval-Augmented Generation (RAG) with Step-Back Prompting

shrihari katti

Retrieval-Augmented Generation (RAG) is a transformative approach that integrates document retrieval with language generation, enabling models to answer queries based on specific content, such as PDFs. While RAG excels at fetching relevant information, it can sometimes lack depth for complex or abstract questions. This is where Step-Back Prompting comes in—a technique that broadens the model's perspective by first addressing a foundational question before tackling the user’s specific query. In this blog, we’ll explore Step-Back Prompting in RAG and provide a detailed walkthrough of a Python script that implements it for an interactive PDF query assistant.

What is Step-Back Prompting?

Step-Back Prompting enhances RAG by prompting the language model to "step back" and consider a broader, related question before answering the original query. This two-step process ensures responses are well-grounded in fundamental concepts, improving accuracy and completeness.

For example:

  • User Query: "How does recursion work in Python?"

  • Step-Back Query: "What is recursion in programming?"

  • Outcome: The model first understands recursion as a general concept (a function calling itself with a base case) before explaining its specifics in Python.

This approach is particularly useful for technical or nuanced questions where context is key.
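To make the example concrete, here is the general idea the step-back query surfaces — a function that calls itself and stops at a base case — as a minimal Python sketch:

```python
def factorial(n: int) -> int:
    if n == 0:                       # base case stops the self-calls
        return 1
    return n * factorial(n - 1)      # recursive call moves toward the base case
```

Once the model has this foundation in hand, explaining Python-specific details (call stack, recursion limit) becomes much easier.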

System Overview

  1. Load and Split PDF: Extract text from a PDF and divide it into manageable chunks.

  2. Embed and Store: Convert text chunks into embeddings and store them in a vector database (Qdrant).

  3. Generate Step-Back Query: Use a language model to create a broader question related to the user’s query.

  4. Retrieve Documents: Fetch relevant chunks for both the original and step-back queries.

  5. Construct Prompt: Combine both contexts into a structured prompt.

  6. Generate Response: Use the language model to produce an informed answer.

  7. Interactive Loop: Allow continuous user queries.
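The steps above can be sketched as one small, library-free pipeline function. Every name here is a hypothetical stand-in for the real components built in the walkthrough below:

```python
def answer_with_step_back(query, generate_step_back, retrieve, generate):
    step_back_query = generate_step_back(query)            # step 3: broaden the query
    context = retrieve(query) + retrieve(step_back_query)  # step 4: fetch for both queries
    prompt = f"Context: {context}\nQuestion: {query}"      # step 5: combine into one prompt
    return generate(prompt)                                # step 6: produce the answer

# Toy stand-ins that just echo their inputs, to show the control flow:
answer = answer_with_step_back(
    "How does recursion work in Python?",
    generate_step_back=lambda q: "What is recursion in programming?",
    retrieve=lambda q: [f"excerpt for: {q}"],
    generate=lambda prompt: prompt,
)
```

The real implementation swaps in an LLM for `generate_step_back` and `generate`, and a vector store for `retrieve`.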

Let's dive into writing code and understanding the step-back prompt in RAG.

Code Walk Through

1. Import Required Libraries

import os
from pathlib import Path
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_google_genai import GoogleGenerativeAIEmbeddings, ChatGoogleGenerativeAI
from langchain_qdrant import QdrantVectorStore

2. Load and Process the PDF

pdf_path = Path(__file__).parent / "Python Programming.pdf"
loader = PyPDFLoader(file_path=pdf_path)
docs = loader.load()

3. Split the Document into Chunks

text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
split_docs = text_splitter.split_documents(documents=docs)

4. Generate Embeddings

embedder = GoogleGenerativeAIEmbeddings(
    model="models/text-embedding-004",
    google_api_key=os.environ["GOOGLE_API_KEY"]  # read from the environment; avoid hard-coding keys
)

5. Set Up the Vector Store

vector_store = QdrantVectorStore.from_documents(
    documents=split_docs,
    embedding=embedder,
    url="http://localhost:6333",
    collection_name="pdf_assistant"
)

6. Initialize the Language Model

llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",
    google_api_key=os.environ["GOOGLE_API_KEY"]  # same environment variable as the embedder
)

7. Define the System Prompt

SYSTEM_PROMPT = """
You are a smart PDF assistant designed to help users understand a PDF document’s content. Your task is to provide accurate, clear, and concise responses based on the user’s query and the provided PDF excerpts. Use Step-Back Prompting to think more broadly about the query before addressing its specifics. Follow these guidelines:

1. **Query Handling**:
   - For specific queries, first consider the broader context or foundational concepts.
   - For general queries, provide a concise overview.

2. **Use Excerpts Only**:
   - Base your response solely on the provided excerpts.
   - If the info isn’t there, say: "The PDF does not contain this information."

3. **Response Style**:
   - Use simple, clear language.
   - Show your step-back reasoning before giving the final answer.

If the query is unclear, ask for clarification.
"""

8. Generate Step-Back Query

def generate_step_back_query(query):
    # Ask for exactly one question with no preamble, so the raw response
    # can be used directly as a search query.
    step_back_prompt = (
        f"Generate one broader, more general question related to: {query}\n"
        "Return only the question, with no preamble."
    )
    response = llm.invoke(step_back_prompt)
    return response.content.strip()

9. Retrieve Documents

def retrieve_documents(vector_store, query, step_back_query, k=3):
    original_docs = vector_store.similarity_search(query, k=k)
    step_back_docs = vector_store.similarity_search(step_back_query, k=k)
    all_docs = original_docs + step_back_docs
    unique_contents = set()
    unique_docs = []
    for doc in all_docs:
        if doc.page_content not in unique_contents:
            unique_contents.add(doc.page_content)
            unique_docs.append(doc)
    return unique_docs
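Because the original and step-back queries often retrieve overlapping chunks, the loop above drops duplicates while keeping the order of first appearance. The same pattern, shown on plain strings:

```python
chunks = ["intro to recursion", "base cases", "intro to recursion", "stack frames"]
seen, unique = set(), []
for c in chunks:
    if c not in seen:     # keep only the first occurrence of each chunk
        seen.add(c)
        unique.append(c)
```

A `dict.fromkeys(chunks)` one-liner would also work for strings, but the explicit loop extends naturally to `Document` objects keyed by `page_content`.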

10. Construct the Step-Back Prompt

def construct_step_back_prompt(query, step_back_query, context):
    step_back_prompt = (
        SYSTEM_PROMPT + "\n\n"
        "Based on the following PDF excerpts, answer the question using Step-Back Prompting.\n\n"
        "Step-Back Query: " + step_back_query + "\n\n"
        "Excerpts:\n"
        f"{context}\n\n"
        "Original Question: " + query + "\n\n"
        "Let’s think step-by-step:\n"
        "1. Consider the broader context from the step-back query.\n"
        "2. Relate this context to the original question.\n"
        "3. Formulate a clear, concise answer.\n\n"
        "So, the answer is:"
    )
    return step_back_prompt

11. Generate the Response

def chat_with_step_back(query, vector_store, llm):
    step_back_query = generate_step_back_query(query)
    retrieved_docs = retrieve_documents(vector_store, query, step_back_query)
    context = "\n\n".join([doc.page_content for doc in retrieved_docs])
    step_back_prompt = construct_step_back_prompt(query, step_back_query, context)
    response = llm.invoke(step_back_prompt)
    return response.content

12. Interactive Loop

print("Welcome to the PDF Query Assistant with Step-Back Prompting!")
while True:
    query = input("Ask a question about the PDF (or type 'exit' to quit): ")
    if query.lower() == 'exit':
        print("Goodbye!")
        break
    if not query.strip():
        print("Please enter a valid question.")
        continue
    try:
        answer = chat_with_step_back(query, vector_store, llm)
        print("Assistant:", answer)
    except Exception as e:
        print(f"An error occurred: {e}")


Conclusion:

Step-Back Prompting strengthens Retrieval-Augmented Generation (RAG) by pairing broad conceptual grounding with precise, document-specific detail. Stepping back to a foundational question first gives the model the context it needs for complex queries, and retrieving for both the original and the broader query makes responses more complete. For technical or nuanced questions, this simple two-step structure is an easy and effective upgrade to a standard RAG pipeline.
