Take a Step Back in RAG

Nitesh Singh
6 min read

When humans face a hard problem, they often step back and abstract it into high-level principles that guide the solution. Inspired by this, step-back prompting was proposed to ground an LLM’s reasoning on abstractions and reduce the chance of errors in the intermediate reasoning steps.

The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise. — Edsger W. Dijkstra

What is Step Back Prompting?

Step-back prompting is a technique in which the model first answers a broader, more general version of the question instead of tackling the user’s question head-on. This helps the model focus on abstract principles before reasoning through the details.

Step-back prompting involves two steps:

  1. Abstraction

  2. Reasoning

Let’s understand this with an example (a minimal code sketch follows it):

  • Original Question: “What happens to the pressure, P, of an ideal gas if the temperature is increased by a factor of 2 and the volume is increased by a factor of 8?”

    Here, the LLM can deviate from the first principle of the Ideal Gas Law when reasoning directly on the question.

  • Abstraction Question: “What fundamental concepts or principles are involved in this problem?”

    The abstraction question encompasses a high-level concept of the original question.

  • Abstraction Output: “The Ideal Gas Law: PV = nRT, where P is the pressure, V is the volume, n is the number of moles, R is the gas constant, and T is the temperature.”

    A high-level concept of the original question is retrieved.

  • Reasoning Question: “Use the Ideal Gas Law: PV = nRT to answer the question: What happens to the pressure of an ideal gas when temperature increases by a factor of 2, and volume increases by a factor of 8?”

    After retrieving the principle, the model is guided to solve the original problem using the principle.

  • Final Output: “Apply the ideal gas law to determine the pressure.

    If the temperature is increased by a factor of 2, then T becomes 2T. If the volume is increased by a factor of 8, then V becomes 8V.

    Substituting these values into the ideal gas law, we get: P(8V) = nR(2T)

    Dividing both sides by 8V, we get: P = nR(2T) / 8V

    We can see that the pressure has decreased by a factor of 4.”
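
To make the two steps concrete, here is a minimal sketch of abstraction followed by reasoning as two separate LLM calls. It assumes LangChain’s ChatOpenAI and an OpenAI API key in the environment; the model name and prompt wording are illustrative, not the exact prompts from the step-back paper.

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")  # illustrative model choice

original_question = (
    "What happens to the pressure, P, of an ideal gas if the temperature is "
    "increased by a factor of 2 and the volume is increased by a factor of 8?"
)

# Step 1: Abstraction - ask for the governing principle instead of the answer.
principle = llm.invoke(
    "What fundamental concepts or principles are involved in this problem?\n\n"
    + original_question
).content

# Step 2: Reasoning - answer the original question grounded on the principle.
answer = llm.invoke(
    "Use the following principle to answer the question.\n\n"
    f"Principle:\n{principle}\n\n"
    f"Question: {original_question}"
).content

print(answer)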

When to use?

Step-back prompting in RAG is most effective in the following scenarios:

  • Complex Questions

  • Multi-Step Reasoning

  • Knowledge Intensive Questions

Step-Back Prompting improves the performance of LLMs across fact-seeking, commonsense reasoning, and domain-specific reasoning.

Working

The following are the steps involved in the generation of output using the step-back prompting:

  1. The user provides a prompt (the original question).

  2. The LLM generates a broader, step-back question from the user prompt.

  3. Relevant documents are retrieved for the broader question.

  4. The LLM generates the final output from the retrieved documents and the original user prompt (a minimal sketch of this step follows the list).
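
As a rough sketch of step 4 (this article itself implements only the retrieval step), the retrieved chunks and the original user prompt could be combined into a final answer along these lines. The generation prompt and helper function below are assumptions for illustration, not the author’s exact code.

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Hypothetical generation prompt; the article does not show the author's exact template.
generation_prompt = ChatPromptTemplate.from_template(
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}"
)

def generate_answer(llm, relevant_chunks, user_prompt):
    # Join the retrieved Document objects into a single context string.
    context = "\n\n".join(doc.page_content for doc in relevant_chunks)
    chain = generation_prompt | llm | StrOutputParser()
    return chain.invoke({"context": context, "question": user_prompt})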

Code Implementation

In this article, we will cover the retrieval part of RAG using step-back prompting; the rest builds on the complete implementation of a simple RAG discussed in the previous article, Introduction to RAG.

Python, Langchain, Qdrant, and OpenAI were used for the code.

Retrieval

First, a step-back prompt instructs the LLM to rewrite the given user query into a broader, more general form.

self.step_back_prompt = """
            You are an expert at world knowledge.
            Your task is to rephrase the given question into a more general form that is easier to answer.

            # Example 1
            Question: How to improve Django performance?
            Output: What factors impact web app performance?

            # Example 2
            Question: How to optimize browser cache in Django?
            Output: What are the different caching options?

            # Example 3
            Question: Which position did Knox Cunningham hold from May 1955 to Apr 1956?
            Output: Which positions has Knox Cunningham held in his career?

            # Example 4
            Question: Who was the spouse of Anna Karina from 1968 to 1974?
            Output: Who were the spouses of Anna Karina?

            # Example 5
            Question: Which team did Thierry Audel play for from 2007 to 2008?
            Output: Which teams did Thierry Audel play for in his career?

            Question: {question}
            Output:
        """

A retrieval chain is created using Langchain. ChatPromptTemplate converts the prompt string into the message format expected by the LLM, the OpenAI GPT model serves as the LLM, and StrOutputParser transforms the LLM output into a string. Qdrant is used as the vector store for retrieving the relevant documents.
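
The llm and retriever passed into the chain are not constructed in the snippet below; one plausible setup, with an assumed Qdrant URL, collection name, and model names, might look like this.

from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_qdrant import QdrantVectorStore
from qdrant_client import QdrantClient

# Assumed connection details, collection name, and model names - adjust to your setup.
client = QdrantClient(url="http://localhost:6333")
vector_store = QdrantVectorStore(
    client=client,
    collection_name="rag_documents",
    embedding=OpenAIEmbeddings(model="text-embedding-3-small"),
)

llm = ChatOpenAI(model="gpt-4o-mini")
retriever = vector_store.as_retriever(search_kwargs={"k": 4})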

def get_relevant_chunks(self, llm, retriever, user_prompt):
    step_back_prompt_template = ChatPromptTemplate.from_template(self.step_back_prompt)

    retrieval_chain = (
        step_back_prompt_template
        | llm
        | StrOutputParser()
        # The LLM output at this point is the step-back question, e.g.
        # "How does changing temperature and volume affect the pressure of an ideal gas?"
        | retriever
    )

The retrieval chain is invoked with the user prompt.

relevant_chunks = retrieval_chain.invoke(
    {"question": user_prompt}
)

The relevant documents are returned.

return relevant_chunks

Here is the complete code for retrieval using step-back prompting.

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

class Step_Back:
    def __init__(self) -> None:
        self.step_back_prompt = """
            You are an expert at world knowledge.
            Your task is to rephrase the given question into a more general form that is easier to answer.

            # Example 1
            Question: How to improve Django performance?
            Output: What factors impact web app performance?

            # Example 2
            Question: How to optimize browser cache in Django?
            Output: What are the different caching options?

            # Example 3
            Question: Which position did Knox Cunningham hold from May 1955 to Apr 1956?
            Output: Which positions has Knox Cunningham held in his career?

            # Example 4
            Question: Who was the spouse of Anna Karina from 1968 to 1974?
            Output: Who were the spouses of Anna Karina?

            # Example 5
            Question: Which team did Thierry Audel play for from 2007 to 2008?
            Output: Which teams did Thierry Audel play for in his career?

            Question: {question}
            Output:
        """

    def get_relevant_chunks(self, llm, retriever, user_prompt):
        step_back_prompt_template = ChatPromptTemplate.from_template(self.step_back_prompt)

        retrieval_chain = (
            step_back_prompt_template
            | llm
            | StrOutputParser()
            # The LLM output at this point is the step-back question, e.g.
            # "How does changing temperature and volume affect the pressure of an ideal gas?"
            | retriever
        )

        relevant_chunks = retrieval_chain.invoke(
            {"question": user_prompt}
        )

        return relevant_chunks
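
For reference, a possible way to call the class, assuming llm and retriever were created as sketched earlier:

step_back = Step_Back()

relevant_chunks = step_back.get_relevant_chunks(
    llm,
    retriever,
    "What happens to the pressure, P, of an ideal gas if the temperature "
    "is increased by a factor of 2 and the volume is increased by a factor of 8?",
)

for doc in relevant_chunks:
    print(doc.page_content[:200])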

The relevant documents and the user prompt were then given to the LLM, which generated the following output.

User Prompt

"What happens to the pressure, P, of an ideal gas if the temperature is increased by a factor of 2 and the volume is increased by a factor of 8 ?"

Output

For an ideal gas, the pressure \( P \), volume \( V \), and temperature \( T \) are related by the ideal gas law:

\[
P V = nRT
\]

where \( n \) and \( R \) are constants for a given amount of gas.

If the temperature \( T \) is increased by a factor of 2, then:

\[
T_{\text{new}} = 2T
\]

If the volume \( V \) is increased by a factor of 8, then:

\[
V_{\text{new}} = 8V
\]

Substitute these into the ideal gas law (assuming the number of moles \( n \) remains constant):

\[
P_{\text{new}} V_{\text{new}} = n R T_{\text{new}}
\]

\[
P_{\text{new}} \times 8V = n R \times 2T
\]

Divide both sides by \( 8V \):

\[
P_{\text{new}} = \frac{n R \times 2T}{8V} = \frac{2}{8} \times \frac{n R T}{V} = \frac{2}{8} P = \frac{1}{4} P
\]

So, the pressure decreases to one-fourth of the original pressure.

**Answer:**

When the temperature is doubled and the volume is increased eightfold, the pressure of an ideal gas decreases to one fourth of its initial value.

Conclusion

Step-back prompting is a simple yet general method for eliciting deeper reasoning via abstraction in large language models. Abstraction helps the model hallucinate less and reason better, likely surfacing reasoning abilities that stay hidden when the model responds to the original question directly. You can get the complete code for RAG using step-back prompting on GitHub.
