Introduction

COT stands for Chain of Thought. Chain-of-Thought (COT) prompting is a method in which the AI generates intermediate reasoning steps before reaching the final answer.

COT = Explain steps → Then Final Answer.

"Clear thinking is like a ladder: step-by-step reasoning takes you higher toward the right answer."

- Hitesh Choudhary

Architecture

Why do we need COT?

Imagine you're solving a coding problem or making a decision:

If you just guess an answer, you might be wrong.
But if you think out loud step-by-step, you catch mistakes and make smarter decisions.

Chain of Thought makes AI think out loud like a human solving a puzzle carefully.

Let’s take a real-life example to understand it better.

Let's say you ask:

Scenario 1: Without COT

User: "Which phone has better battery life: Phone A or Phone B?"

AI: Phone A is better.

Scenario 2: With COT (Chain of Thought)

User: "Which phone has better battery life: Phone A or Phone B?"

AI: Check battery capacity: Phone A has 5000mAh, Phone B has 4500mAh.

Check screen refresh rate: Phone A has 120Hz, Phone B has 60Hz (a higher refresh rate uses more battery).

Check chipset efficiency: Phone A has a more power-efficient processor.

Therefore, Phone A likely has better battery life.

Without COT = Jump to final answer (sometimes wrong).
With COT = Think step-by-step like a smart student → final answer is usually better.

It is especially useful in:

Complex queries (math, logical, multi-step reasoning)
Ecommerce advice (compare specs, weigh pros/cons)
Medical, legal, coding problems (step-by-step checks)
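
The difference between the two scenarios above comes down to how the prompt is worded. Here is a minimal sketch of the two prompt styles; the function names `build_direct_prompt` and `build_cot_prompt` and the exact instruction wording are illustrative assumptions, not a fixed recipe. Only the prompt strings are built, and no model is called.

```python
def build_direct_prompt(question: str) -> str:
    """Scenario 1: ask for the answer only."""
    return f"Question: {question}\nAnswer:"


def build_cot_prompt(question: str) -> str:
    """Scenario 2: ask the model to reason before it answers."""
    return (
        f"Question: {question}\n"
        "Think step-by-step, then give the final answer on a line "
        "starting with 'Final Answer:'.\n"
        "Let's think step-by-step:"
    )


if __name__ == "__main__":
    question = "Which phone has better battery life: Phone A or Phone B?"
    print(build_direct_prompt(question))
    print()
    print(build_cot_prompt(question))
```

The only change between the two is the added instruction to reason first, which is what nudges the model to write out its intermediate checks.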

Code

Before you run the program, create a virtual environment and install the required dependencies. I am using “uv” to create the virtual environment; for more details, check out this video:

https://www.youtube.com/watch?v=8mk85fyzevc

To set up Qdrant DB locally, follow the resources below:

https://qdrant.tech/documentation/quickstart/

https://www.youtube.com/watch?v=mHrwS6ZoNKc

Algorithm

Step 1 => Load your data (e.g., from PDF / text).
Step 2 => Chunk your data (break into smaller pieces).
Step 3 => Store chunks into a vector database (Qdrant DB).
Step 4 => User asks a question.
Step 5 => Do retrieval (find relevant chunks).
Step 6 => Build a prompt that forces AI to think step-by-step:
Step 7 => Tell AI to "reason step-by-step before answering."
Step 8 => Send this special prompt to the LLM (like GPT).
Step 9 => Return final answer + reasoning chain.
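
The steps above can be sketched end to end without any external services. This is a toy illustration under stated assumptions: word overlap stands in for embedding similarity, a plain Python list stands in for the vector database, and the LLM call in Step 8 is left out. The helper names (`chunk_text`, `retrieve`, `build_cot_prompt`) and the sample product data are hypothetical.

```python
import re
from typing import List, Set


def chunk_text(text: str, chunk_size: int, overlap: int) -> List[str]:
    """Step 2: split text into fixed-size character chunks that overlap."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, store: List[str], k: int) -> List[str]:
    """Step 5: rank chunks by shared words (toy stand-in for vector search)."""
    query_words = tokenize(query)
    ranked = sorted(
        store,
        key=lambda chunk: len(query_words & tokenize(chunk)),
        reverse=True,
    )
    return ranked[:k]


def build_cot_prompt(context: str, question: str) -> str:
    """Steps 6-7: combine context and question with a step-by-step instruction."""
    return (
        "Answer the question based on the context below.\n"
        "Reason step-by-step before answering.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n\n"
        "Let's think step-by-step:"
    )


if __name__ == "__main__":
    # Step 1: load data (hard-coded sample text instead of a PDF).
    docs = [
        "Phone A has a 50MP camera with optical image stabilization.",
        "Phone B has a 12MP camera and a 5000mAh battery.",
        "Laptop C has 16GB of RAM and a fast SSD.",
    ]
    # Steps 2-3: chunk each document and store the chunks in a list.
    store: List[str] = []
    for doc in docs:
        store.extend(chunk_text(doc, chunk_size=80, overlap=10))
    # Steps 4-7: take a question, retrieve context, build the prompt.
    question = "Which phone has the best camera?"
    context = "\n\n".join(retrieve(question, store, k=2))
    print(build_cot_prompt(context, question))
```

In a real pipeline, the list and the word-overlap scoring would be replaced by embeddings stored in the vector database, and the printed prompt would be sent to the LLM.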

COT implementation

# IMPORTS
# These are the legacy LangChain import paths; newer releases expose the same
# classes from the langchain_community and langchain_openai packages.
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Qdrant  # the class is named Qdrant
from langchain.schema import Document
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

from dotenv import load_dotenv
import os
from typing import List

# LOAD ENV VARIABLES
load_dotenv()

# SET API KEYS
openai_api_key = os.getenv("OPENAI_API_KEY")

# -------------------------------
# STEP 1: LOAD DATA
# -------------------------------
def load_documents(pdf_path: str) -> List[Document]:
    # PyPDFLoader returns one Document per PDF page
    loader = PyPDFLoader(pdf_path)
    return loader.load()

# LOAD FAKE ECOMMERCE PDF
pdf_path = "ecommerce_products.pdf"
documents = load_documents(pdf_path)

print(f"Loaded {len(documents)} documents.")

# -------------------------------
# STEP 2: CHUNK THE DATA
# -------------------------------
# 500-character chunks; neighbouring chunks share 50 characters
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(documents)

print(f"Total Chunks: {len(chunks)}")

# -------------------------------
# STEP 3: STORE IN VECTOR DATABASE
# -------------------------------
# EMBEDDING MODEL
embedding_model = OpenAIEmbeddings(openai_api_key=openai_api_key)

# Assumption: a Qdrant server from the local setup guide is running on
# http://localhost:6333; the collection name is illustrative.
vector_db = Qdrant.from_documents(
    chunks,
    embedding=embedding_model,
    url="http://localhost:6333",
    collection_name="ecommerce_products",
)

print("Vector database created.")

# -------------------------------
# STEP 4: USER QUESTION
# -------------------------------
query = "Which mobile phone has the best camera quality?"

# -------------------------------
# STEP 5: RETRIEVE RELEVANT CHUNKS
# -------------------------------
relevant_docs = vector_db.similarity_search(query, k=5)

# BUILD CONTEXT
retrieved_context = "\n\n".join(doc.page_content for doc in relevant_docs)

# -------------------------------
# STEPS 6-7: CREATE CHAIN OF THOUGHT PROMPT
# -------------------------------
prompt_template = PromptTemplate(
    input_variables=["context", "question"],
    template="""
        You are a smart ecommerce assistant.
        Answer the question based on the context below.
        THINK step-by-step before giving the final answer.

        Context:{context}

        Question:{question}

        Let's think step-by-step:
    """
)

# INITIALIZE LLM
# temperature=0 keeps the output as repeatable as possible
llm = OpenAI(
    temperature=0,
    openai_api_key=openai_api_key
)

# CREATE CHAIN
chain = LLMChain(llm=llm, prompt=prompt_template)

# -------------------------------
# STEP 8: SEND THE PROMPT TO THE LLM
# -------------------------------
response = chain.run(context=retrieved_context, question=query)

# -------------------------------
# STEP 9: RETURN FINAL ANSWER + REASONING CHAIN
# -------------------------------
print("\n=== Chain of Thought Reasoning ===\n")
print(response)
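
Step 9 of the algorithm returns both the final answer and the reasoning chain, while the script prints the response as a single block of text. One possible way to separate the two is sketched below. It assumes the prompt is extended to ask the model to end with a line beginning `Final Answer:`, which the prompt in this article does not do. The function name `split_reasoning_and_answer`, the marker, and the sample response are all hypothetical.

```python
from typing import Tuple

# Assumed marker; the prompt would need to ask the model to use it.
FINAL_MARKER = "Final Answer:"


def split_reasoning_and_answer(response: str) -> Tuple[str, str]:
    """Return (reasoning, answer); answer is empty if the marker is missing."""
    # rpartition splits on the LAST occurrence of the marker.
    reasoning, marker, answer = response.rpartition(FINAL_MARKER)
    if not marker:
        # No marker found: treat the whole response as reasoning.
        return response.strip(), ""
    return reasoning.strip(), answer.strip()


if __name__ == "__main__":
    # Hand-written sample text, not real model output.
    sample = (
        "1. Phone A has a 50MP camera.\n"
        "2. Phone B has a 12MP camera.\n"
        "Final Answer: Phone A"
    )
    reasoning, answer = split_reasoning_and_answer(sample)
    print("Reasoning:\n" + reasoning)
    print("Answer: " + answer)
```

Splitting on the last occurrence of the marker keeps the function safe if the model happens to mention the marker earlier in its reasoning.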

Output

Full working code on GitHub:

https://github.com/gautamkmahato

Conclusion

Chain of Thought (CoT) prompting helps AI models reason more like humans by breaking complex problems into smaller, logical steps. Instead of rushing to an answer, CoT guides the model to "think out loud," leading to better, more accurate responses in tricky situations.

Understanding Chain of Thought (CoT) Prompting: Teaching AI to Think Step-by-Step
