CoT - Chain of Thought

Jaskamal Singh
5 min read

Combining Chain of Thought (CoT) with Retrieval-Augmented Generation (RAG) means structuring the RAG process to include step-by-step reasoning, typically by analyzing retrieved documents before generating a final answer.
This approach, often referred to as CoT-RAG, leverages the strengths of both techniques: CoT for structured reasoning and RAG for access to relevant information.

Let's understand it with a diagram:

Let me walk through an example, it'll become clear:

🤔 What is Chain of Thought Prompting?

Imagine you ask an AI:

"Hey, what's the connection between GDP and inflation?"

Now, if the AI answers directly, the response can sometimes be surface-level. But if the AI lays out its thinking process step by step, the way a teacher writes on the board while explaining, that is Chain of Thought Prompting.
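The core trick is simply in how you phrase the prompt. Here's a minimal sketch of building a CoT-style prompt (the wording and helper name are illustrative, not from this article's code):

```python
def make_cot_prompt(question: str) -> str:
    # Explicitly asking the model to reason step by step before answering
    # is the essence of Chain of Thought Prompting.
    return (
        "Answer the question below. Think step by step: "
        "write out each reasoning step, then give the final answer.\n\n"
        f"Question: {question}"
    )

prompt = make_cot_prompt("What is the connection between GDP and inflation?")
print(prompt)
```

You would send this prompt to any chat model; the "think step by step" instruction alone often pushes the model from a surface-level answer to a reasoned one.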


🪜 A Simple Analogy: A Chain of Thinking

Ask your mom:

"Mom, how does the fridge keep things cold?"

Mom won't give a one-line answer. First she'll say:

  1. “There's a compressor inside the fridge...”

  2. “The compressor compresses the gas...”

  3. “Then that gas evaporates and absorbs the heat...”

  4. “That's why the fridge gets cold.”

That step-by-step thinking is exactly what Chain of Thought is!


🧠 Chain of Thought in RAG

When you ask a RAG model a complex or logical query, like:

"Why do interest rates affect the stock market?"

the AI follows these steps:


🍲 Step 1: Understand the query

The AI receives the query:
"Why do interest rates affect the stock market?"


🍜 Step 2: The AI thinks step by step

Internally, the AI creates this chain of thoughts:

  1. “Raising interest rates makes loans more expensive.”

  2. “When loans are expensive, companies raise less money.”

  3. “Investment slows down, profits fall.”

  4. “So the stock market can fall.”


🍛 Step 3: Retrieval based on the chain

Now the AI retrieves context for each step of the chain: for every reasoning point, it pulls related documents from the vector DB.


🫕 Step 4: Final Answer with Strong Reasoning

The AI then gives you a detailed, well-reasoned answer, like a teacher:

"Interest rates affect the cost of borrowing. When they rise, companies find it expensive to raise capital, leading to lower investment and profitability, which negatively impacts stock prices."
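The four steps above can be sketched end to end with toy stand-ins (the corpus, sub-questions, and answer synthesis here are placeholders so the flow runs without an LLM or vector DB; the real OpenAI + Qdrant implementation follows below):

```python
# Sketch of the CoT-RAG flow: decompose -> retrieve per step -> combine -> answer.
CORPUS = {
    "interest rates": "Higher interest rates make borrowing more expensive.",
    "company investment": "Costly loans reduce how much capital companies raise.",
    "stock prices": "Lower investment and profits push stock prices down.",
}

def decompose(query: str) -> list[str]:
    # Step 2: in the real pipeline an LLM produces these sub-questions.
    return ["interest rates", "company investment", "stock prices"]

def retrieve(sub_query: str) -> str:
    # Step 3: stand-in for a vector-DB similarity search per reasoning step.
    return CORPUS.get(sub_query, "")

def answer(query: str) -> str:
    # Step 4: in the real pipeline an LLM synthesizes the answer from this context.
    context = "\n".join(retrieve(q) for q in decompose(query))
    return f"Context used:\n{context}"

print(answer("Why do interest rates affect the stock market?"))
```

Notice that retrieval happens once per reasoning step, not once per user query; that is what makes the final context richer than plain RAG.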


📦 Why is CoT useful in RAG?

| Benefit | Everyday Example |
| --- | --- |
| Better logic | Like having something explained step by step |
| Better retrieval | A chance to fetch different info at each step |
| Better output | The answer is more human-like, logical, and relatable |

🛍️ Final Punchline:

“Chain of Thought Prompting is just like when a little kid asks a question, and mom patiently tells the whole story to explain it, with logic at every step!”

Now let's look at the actual code:

import ast
import os
from pathlib import Path

from dotenv import load_dotenv
from langchain_community.document_loaders import PyPDFLoader
from langchain_openai import OpenAIEmbeddings
from langchain_qdrant import QdrantVectorStore
from langchain_text_splitters import RecursiveCharacterTextSplitter
from openai import OpenAI

# Load environment variables from .env file
load_dotenv()
apikey = os.environ["OPENAI_API_KEY"]

# Initialize OpenAI client
client = OpenAI(api_key=apikey)

# 1. Load and split PDF
pdf_path = Path(__file__).parent / "node_js_sample.pdf"
loader = PyPDFLoader(str(pdf_path))
docs = loader.load()

# Split document into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
)
split_docs = text_splitter.split_documents(documents=docs)

# 2. Create an embedder
embedder = OpenAIEmbeddings(
    model="text-embedding-3-large",
    api_key=apikey
)

# Only run the below once to insert data into Qdrant
# (from_documents already inserts split_docs; no separate add_documents call is needed)
# vector_store = QdrantVectorStore.from_documents(
#     documents=split_docs,
#     embedding=embedder,
#     url="http://localhost:6333",
#     collection_name="learning_node_js",
# )

# Connect to existing Qdrant vector store

retriever = QdrantVectorStore.from_existing_collection(
    url="http://localhost:6333",
    collection_name="learning_node_js",
    embedding=embedder,
)

print("📄 PDF Ingestion Complete!\n")

question = input("Please enter your question on Node.js: > ")

sys_prompt = f"""
You are an intelligent AI assistant who breaks down the user input into multiple steps of thought,
for better understanding of the context and for generating better outputs.

Return ONLY a Python list of strings, with no extra text.

Example:
User: What is machine learning?
Output:
[
   "what is machine ?",
   "what is learning ?",
   "what is machine learning ?"
]

Now, based on the user input, generate the sub prompts.
User input: {question}
"""
print ("\n🧠 LLM thinking ... \n")

cot_prompts = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "system", "content": sys_prompt}]
)

llm_prompts_str = cot_prompts.choices[0].message.content
llm_prompts = ast.literal_eval(llm_prompts_str)  # parse the model's list-of-strings reply
print("\nCoT prompts generated by the LLM: >")
for p in llm_prompts:
    print(p)

relevant_docs = []

# Retrieve context for each reasoning step from the vector store
for q in llm_prompts:
    docs = retriever.similarity_search(query=q)
    relevant_docs.extend(docs)

#  if you want to print relevant docs

# for i, d in enumerate(relevant_docs, 1):
#     print(f"\n📄 Result {i}:\n")
#     print(d.page_content)  

context = "\n\n".join([doc.page_content for doc in relevant_docs])

final_prompt = f"""
You are a knowledgeable AI assistant. Use the provided context below to answer the user's question accurately and helpfully.

Context:
{context}

User question: {question}

Answer:
"""
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": final_prompt}
    ]
)

answer = response.choices[0].message.content
print("\n🤖 Answer:\n")
print(answer.replace("*","").replace("#",""))
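One fragile spot in the listing above is calling `ast.literal_eval` on raw model output: if the model wraps the list in prose or a markdown code fence, parsing crashes. A more defensive parser (an illustrative sketch, not part of the original code) might look like this:

```python
import ast
import json
import re

def parse_subquestions(raw: str) -> list[str]:
    """Extract a list of strings from raw LLM output, tolerating code fences and stray prose."""
    # Strip markdown code fences the model sometimes adds.
    cleaned = re.sub(r"```(?:json|python)?", "", raw).strip("` \n")
    # Grab the first [...] span so surrounding prose doesn't break parsing.
    match = re.search(r"\[.*\]", cleaned, flags=re.DOTALL)
    if not match:
        return [cleaned]  # fall back to treating the whole reply as one query
    snippet = match.group(0)
    for parser in (json.loads, ast.literal_eval):
        try:
            result = parser(snippet)
            if isinstance(result, list):
                return [str(x) for x in result]
        except (ValueError, SyntaxError):
            continue
    return [cleaned]

print(parse_subquestions('Here you go:\n["what is node?", "what is js?"]'))
# → ['what is node?', 'what is js?']
```

Dropping this in place of the bare `ast.literal_eval(llm_promts_str)` call makes the pipeline survive the occasional chatty reply.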

OUTPUT :

So that’s all about Chain of Thought Prompting! 🧠✨
I hope everything made good sense 🙌😊

See you in the next blog/post/video! 📚🚀
Until then,
👉 Keep Learning 📖
👉 Keep Exploring 🔍
👉 Keep Growing 🌱💡

Take care and happy prompting! 🤖💬💛

#ChaiCode

#GENAI
