Building a Thinking Model from a Non-Thinking Model with Chain-of-Thought


Large Language Models (LLMs) are powerful, but in their default form they often act like “non-thinking models.” They generate outputs directly from inputs without showing the reasoning process. While this works for simple tasks, it quickly breaks down for complex reasoning, math, coding, or multi-step decision making.
This is where Chain-of-Thought (CoT) comes in. It transforms a “non-thinking” model into a thinking model by guiding it to reason step-by-step before producing the final answer.
The Problem with Non-Thinking Models:
Imagine you ask a model:
Example: “A farmer has 12 apples. He gives 4 to his friend and buys 6 more. How many apples does he have now?”
A non-thinking model may answer:
“18” (adding the 6 bought but forgetting the 4 given away)
“8” (subtracting the 4 but ignoring the purchase)
“16” (a plausible-sounding guess)
Why? Because it jumps straight to the answer instead of reasoning through the steps.
What is Chain-of-Thought?
Chain-of-Thought (CoT) is a prompting technique where we encourage the model to explain intermediate steps before the final answer.
Example with CoT prompt:
Prompt:
“A farmer has 12 apples. He gives 4 to his friend and buys 6 more. How many apples does he have now? Think step by step.”
Model Output (CoT):
Start with 12 apples.
He gives 4 away → 12 - 4 = 8.
He buys 6 more → 8 + 6 = 14.
Final Answer = 14.
Now the model thinks out loud, leading to a more reliable answer.
How to Build a Thinking Model from a Non-Thinking Model:
Prompt Engineering (Zero-Shot CoT):
Add phrases like “Let’s think step by step” to encourage reasoning.
Works even on models not explicitly trained for reasoning.
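As a rough sketch, zero-shot CoT is nothing more than appending that cue to the prompt. The call_llm helper below is hypothetical, standing in for whatever LLM API you use; the technique itself is just the prompt suffix.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical helper: send a prompt to your LLM provider and return its text reply."""
    raise NotImplementedError("wire this up to your LLM API of choice")

def zero_shot_cot(question: str) -> str:
    # Zero-shot CoT: append a reasoning cue; nothing else about the request changes.
    prompt = f"{question}\nLet's think step by step."
    return call_llm(prompt)

# Example usage (once call_llm is wired to a real model):
# zero_shot_cot("A farmer has 12 apples. He gives 4 to his friend and buys 6 more. "
#               "How many apples does he have now?")
```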
Few-Shot CoT:
Provide the model with examples of reasoning.
Example:
Q: If you have 2 pens and buy 3 more, how many pens do you have?
A: 2 + 3 = 5. Answer: 5
Q: A farmer has 12 apples, gives away 4, and buys 6 more. How many apples?
A: 12 - 4 + 6 = 14. Answer: 14
This sets a reasoning pattern the model can imitate.
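Here is a minimal sketch of how those worked examples can be stitched into a few-shot prompt; the baker question in the last line is just a made-up new query for illustration.

```python
FEW_SHOT_EXAMPLES = """\
Q: If you have 2 pens and buy 3 more, how many pens do you have?
A: 2 + 3 = 5. Answer: 5
Q: A farmer has 12 apples, gives away 4, and buys 6 more. How many apples?
A: 12 - 4 + 6 = 14. Answer: 14
"""

def few_shot_cot_prompt(question: str) -> str:
    # The worked examples set the reasoning pattern; the new question follows the same format.
    return f"{FEW_SHOT_EXAMPLES}Q: {question}\nA:"

print(few_shot_cot_prompt("A baker has 20 rolls, sells 7, and bakes 5 more. How many rolls?"))
```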
Self-Consistency:
Instead of generating one reasoning path, ask the model to produce multiple CoT outputs and then vote on the most common answer.
Improves reliability in math and logical reasoning.
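One way self-consistency might look in code is sketched below. It assumes a hypothetical call_llm helper that returns a single sampled completion per call, and that the model ends its output with a line like “Answer: 14”.

```python
import re
from collections import Counter

def call_llm(prompt: str, temperature: float = 0.8) -> str:
    """Hypothetical helper: one sampled completion from your LLM provider."""
    raise NotImplementedError("wire this up to your LLM API of choice")

def extract_answer(completion: str):
    # Assumes the model finishes with something like "Answer: 14" or "Final Answer = 14".
    match = re.search(r"(?:Final Answer|Answer)\s*[:=]\s*(-?\d+)", completion)
    return match.group(1) if match else None

def self_consistency(question: str, samples: int = 5):
    prompt = f"{question}\nThink step by step and end with 'Answer: <number>'."
    answers = [extract_answer(call_llm(prompt)) for _ in range(samples)]
    votes = Counter(a for a in answers if a is not None)
    # Majority vote across the sampled reasoning paths.
    return votes.most_common(1)[0][0] if votes else None
```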
Externalizing Reasoning:
Capture the CoT outputs, store them, and analyze them for debugging.
Helpful for explainability in production systems.
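A small sketch of one way to externalize reasoning: append each trace to a JSONL file so it can be inspected later. The file name and record fields here are arbitrary choices, not a standard.

```python
import json
import time

def log_reasoning_trace(question: str, cot_output: str, final_answer: str,
                        path: str = "cot_traces.jsonl") -> None:
    """Append one reasoning trace to a JSONL file for later debugging or audit."""
    record = {
        "timestamp": time.time(),
        "question": question,
        "chain_of_thought": cot_output,
        "final_answer": final_answer,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

log_reasoning_trace(
    "A farmer has 12 apples, gives away 4, and buys 6 more. How many apples?",
    "12 - 4 = 8; 8 + 6 = 14.",
    "14",
)
```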
Speed vs Accuracy Trade-Off:
Without CoT: Fast and cheap, but error-prone on complex tasks.
With CoT: Slower and more token-hungry, but more accurate and interpretable.
Example:
Customer chatbot (simple FAQs) → may not need CoT.
Medical reasoning assistant → must use CoT for safety.
Beyond CoT: Structured Reasoning:
CoT is just the start. Advanced techniques extend this idea:
Tree-of-Thought (ToT): Explore multiple reasoning branches.
Graph-of-Thought: Connect reasoning paths into structured graphs.
Program-Aided Language (PAL): Delegate reasoning steps to external code execution (a rough sketch follows this list).
These move models closer to human-like problem solving.
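As a rough illustration of the PAL idea (not the original paper's implementation): ask the model to emit Python instead of doing the arithmetic itself, then run that code. call_llm is again a hypothetical helper, and executing model-generated code would need sandboxing in any real system.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical helper: returns the model's reply as text."""
    raise NotImplementedError("wire this up to your LLM API of choice")

def pal_solve(question: str) -> str:
    # Ask the model to write Python rather than compute the answer in natural language.
    prompt = (
        f"{question}\n"
        "Write Python code that computes the answer and stores it in a variable "
        "named `result`. Return only the code."
    )
    code = call_llm(prompt)
    namespace: dict = {}
    # NOTE: running model-generated code must be sandboxed in production.
    exec(code, namespace)
    return str(namespace["result"])

# Example usage (once call_llm is wired to a real model):
# pal_solve("A farmer has 12 apples, gives away 4, and buys 6 more. How many apples?")
```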
Conclusion:
A non-thinking model can feel like a “guessing machine.” By applying Chain-of-Thought prompting, we give LLMs a way to reason transparently, step by step.
Start with “Let’s think step by step.”
Use few-shot reasoning examples.
Improve reliability with self-consistency.
CoT is a small tweak with a huge impact, turning LLMs into models that don't just answer, but think.
