Do Prompting Like A Pro: Techniques You Need to Know


Hi, I am back with another topic: prompting. We have all faced the problem of GPT not responding the way we want and giving random, irrelevant answers. This isn't specific to one model; every AI model gets things wrong fairly often.
How can you make them do what you want? How can you get the results you need from them? This is an easy guide on how to do that. Based on your work, there are techniques you can use to get the best output possible. In this blog, we will discuss these techniques and how to implement them. You'll learn how to provide context to your AI model so it can respond effectively.
Let's start by understanding what context actually is.
Context
As discussed in the previous blog, an AI model is a probabilistic machine that provides answers based on the input it receives. Technically, our input determines the quality and richness of the output. This is what context is all about.
Context is the information the model uses to generate its probabilistic output. It includes all the information we provide, questions we've asked previously, metadata, tools we use, and more.
If our context is weak, the model will not understand what we are talking about and will produce poor output. For example, if we ask for a script, the model won't know if we mean a programming script or a film or video script.
Rule number one is that our context should be rich enough for the model to understand what we are talking about.
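To make the "script" example concrete, here is a minimal sketch (written with the same OpenAI client used in the later snippets; the exact prompts are just illustrations) that contrasts a vague prompt with a context-rich one:
from openai import OpenAI

client = OpenAI()

# Vague: the model has to guess what kind of "script" we mean
vague = client.responses.create(
    model="gpt-4.1",
    input="Write me a script."
)

# Context-rich: language, task, and constraints are spelled out
rich = client.responses.create(
    model="gpt-4.1",
    input="Write me a short Python script that renames all .txt files in the current folder to lowercase. Keep it under 20 lines and add comments."
)

print(vague.output_text)
print(rich.output_text)
The second prompt leaves much less room for the model to guess, which is exactly what rich context is supposed to do.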
Context lays the foundation for understanding how to prompt and what is prompting.
Prompting is simply the method of providing what we've just learned. In simpler terms, prompting is the art of giving context to the model so it can better understand our question, which increases the chances of getting the correct answer. This can be quite challenging because most AI-based applications, and applications in general, need their output to be as accurate as possible. Since an AI model is essentially a probability machine, it is difficult to guarantee the correct answer every time. A good prompt increases your chances of getting the right answer, but it doesn't guarantee it.
So, how do you make your context strong enough to get the most accurate answer? To achieve this, we follow prompting techniques. There isn't a one-size-fits-all solution for getting accurate answers; instead, we have multiple prompting techniques.
Prompting techniques are as follows. I will also include code snippets for better understanding, so you can code along with them.
So, let's start with the basic one.
Zero Shot Prompting
This type of prompting is what we do every day when we ask ChatGPT something like, "write me an email for blah blah." This is called zero shot prompting. We don't provide the model with much context because we assume it will generally give the right answer (or maybe we're just being lazy, haha). We don't give any demonstrations in this type of prompting.
We use this type of prompting when:
we know the task is common
we know it is very clear to understand
the model has been trained on such data
from openai import OpenAI

client = OpenAI()

# Zero-shot: no examples, just a direct request
response = client.responses.create(
    model="gpt-4.1",
    input="Write a Python snippet that prints hello world"
)
print(response.output_text)
Above is a clear code example of zero shot prompting. This type of prompting only works when the task is common and easy to understand; otherwise, you might get a poor output.
Few Shot Prompting
Few shot prompting is a technique where we give the model a few examples before asking the real question. It's a great way to provide context to the AI model, but it is not very cost-efficient; we'll discuss that later. This technique helps the model understand what the user actually wants, allowing it to respond according to your needs.
from openai import OpenAI

client = OpenAI()

prompt = """
Convert the following sentences to passive voice.
Here are some examples:
Active: The cat chased the mouse.
Passive: The mouse was chased by the cat.
Active: She completed the project.
Passive: The project was completed by her.
Now convert:
Active: The team won the match.
"""

response = client.responses.create(
    model="gpt-4.1",
    input=prompt
)
print(response.output_text)
Here, I used a code example of few-shot prompting to convert active sentences to passive voice, which helps us get the desired output. The drawback of this technique is that the examples are sent with every request, which adds input tokens and drives up cost. To keep costs down, we can pair it with another type of prompting, which we will discuss in upcoming topics.
Chain Of Thought Prompting
In chain of thought prompting, we don't expect the response immediately. Instead, we want the model to work through the question step by step and then answer it. We can also add an extra step by pairing it with another technique in which the model checks whether the answer it produced is correct.
This type of prompting produces highly accurate results because it breaks the question into smaller parts, making it easier to handle, and it verifies the answer. It's used by many applications and models we rely on, like Cursor and OpenAI's reasoning models (the o series).
from openai import OpenAI
import json

client = OpenAI()

system_prompt = """
You are an AI assistant that solves complex problems using chain-of-thought reasoning.
Instructions:
1. First, analyze the input to understand what the user is asking.
2. Then perform step-by-step thinking to reach the final understanding — one step at a time.
3. Only use the steps: "analyze" and "think".
4. Each step must follow the format: {"step":"...", "content":"..."}
5. Output must always be valid JSON in this format.
6. Never give the final answer, just think out loud.
Stop reasoning if you believe no further thinking is needed. In that case, respond with done and your final response: {"step":"done","content":"blah blah"}
"""

messages = [
    {"role": "system", "content": system_prompt}
]

query = input("> ")
messages.append({"role": "user", "content": query})

# Keep asking the model for the next reasoning step until it says it is done
while True:
    result = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=messages
    )
    parsed_response = json.loads(result.choices[0].message.content)
    step = parsed_response.get("step")
    content = parsed_response.get("content")

    if step == "done":
        print(f"✅: {content}")
        break
    elif step == "analyze":
        print(f"🧠 [Analyze]: {content}")
    elif step == "think":
        print(f"🔍 [Think]: {content}")
    else:
        print(f"⚠️ [Unknown Step]: {content}")

    # Feed the step back into the conversation so the model continues from it
    messages.append({"role": "assistant", "content": json.dumps(parsed_response)})
Here, I have provided a brief example of chain of thought prompting, where we ask the model to think in several steps before giving the output. This approach is used by models and applications where a high level of accuracy is required. However, it is more expensive, because we call the model multiple times before returning the final response.
Self Consistency Prompting
Self consistency prompting is a powerful decoding strategy that improves a model’s accuracy and reliability. Instead of creating just one response, the model is prompted multiple times with the same input, each time following a different reasoning path due to randomness from a higher temperature setting. The most frequent final answer from all these paths is selected as the correct one.
The idea is that correct answers are reached more consistently, even if the reasoning varies a bit. This doesn't involve the model checking or fixing its responses during the process; it's more like "ask it 10 times, and choose the answer that comes up the most."
Example: if you want to calculate 2 + 2, here is a code-based example:
from openai import OpenAI
from collections import Counter

client = OpenAI()

def get_response():
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": "What is 2 + 2? Let's think step by step."}
        ],
        temperature=0.7,  # some randomness so the reasoning paths differ
    )
    return response.choices[0].message.content.strip()

# Sample the model several times with the same prompt
answers = [get_response() for _ in range(5)]

# Pull the last number mentioned in each response as its final answer
final_answers = []
for ans in answers:
    found = None
    for line in reversed(ans.split("\n")):
        for word in reversed(line.split()):
            cleaned = word.strip(".,!*")
            if cleaned.isdigit():
                found = cleaned
                break
        if found:
            break
    if found:
        final_answers.append(found)

# Majority vote: the most frequent final answer wins
if final_answers:
    most_common = Counter(final_answers).most_common(1)[0]
    print("Most consistent answer:", most_common[0])
    print("All answers:", final_answers)
else:
    print("No valid answers found in model outputs.")
In this example, our model generates 5 outputs and compares them to decide which one is the most consistent. This type of prompting helps us increase accuracy, but as you can see, it also increases the cost.
Instruction Prompting
Instruction prompting means giving the model a clear and direct command about what you want it to do. It looks like this: "Summarize this text," "Translate the following sentence into French," or "Write a Python function that reverses a string." You're not providing examples or asking the model to think step by step; you're simply telling it what to do directly. We use this every day by giving the model an instruction and getting an output back. It is essentially a form of zero-shot prompting; you can refer to the zero-shot prompting snippet above, or see the minimal sketch below.
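Here is a minimal instruction-prompting sketch (the instruction text and sample passage are just illustrations, using the same client setup as the earlier snippets):
from openai import OpenAI

client = OpenAI()

# A single, direct instruction: no examples, no step-by-step reasoning requested
response = client.responses.create(
    model="gpt-4.1",
    input="Summarize the following text in two sentences:\n\nLarge language models predict the next token based on the context they are given. The richer the context, the better the prediction."
)
print(response.output_text)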
Persona Based Prompting
Persona based prompting is a technique that lets us tailor the model's responses to mimic the tone, style, and mannerisms of a specific person, personality, or profession. The model acts as a particular character or individual and answers in line with the persona it is emulating. This is particularly useful when we want the model to adopt a specific perspective or voice, such as that of a historical figure, a fictional character, or a professional expert in a certain field. By setting the role before asking questions, we get responses that are not only informative but also contextually rich, relatable, and engaging.
For example, instead of just saying "explain recursion," you can say "imagine you are a senior engineer mentoring a junior dev. Explain recursion simply."
import openai

client = openai.OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "You are a calm, empathetic therapist who helps people understand their emotions."
        },
        {
            "role": "user",
            "content": "I feel unmotivated and distracted. What should I do?"
        }
    ]
)
print(response.choices[0].message.content.strip())
Here, you can see in the code that I have used the example of a therapist. This type of prompting gives us more control over how we want our output to be or sound.
Role Playing Prompting
This is the technique we used previously, along with the persona based prompting example. In that example, you can see roles like system and user. These roles help the AI understand who is providing each message and improve its ability to generate the right response for the end user. Here, system carries the instructions coming from the application, and user is the prompt provided by the end user.
This prompting technique is usually combined with other techniques to help us get the right output, as in the sketch below.
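Here is a minimal sketch of how the roles fit together (the planted user/assistant exchange also doubles as a few-shot example; the exact messages are just an illustration):
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        # system: instructions that set up the behaviour
        {"role": "system", "content": "You are a strict English teacher who corrects grammar in one short sentence."},
        # user + assistant: a planted exchange the model can imitate
        {"role": "user", "content": "me and him goes to market yesterday"},
        {"role": "assistant", "content": "Correct form: \"He and I went to the market yesterday.\""},
        # user: the actual prompt from the end user
        {"role": "user", "content": "she don't like no vegetables"}
    ]
)
print(response.choices[0].message.content.strip())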
Contextual Prompting
This type of prompting is used when we already have previous context and background information that the model needs to give an accurate response. It is primarily used in chat scenarios, for example, when you need to respond on behalf of someone in a chat. The model needs to have the previous conversation between the user and the person.
from openai import OpenAI

client = OpenAI()

messages = [
    {
        "role": "system",
        "content": (
            "You are a helpful and friendly customer support agent for a SaaS company called PriceMate. "
            "You are taking over an existing conversation and need to continue from where the previous agent left off. "
            "Respond empathetically and clearly."
        )
    },
    {
        "role": "user",
        "content": (
            "Here is the previous conversation:\n\n"
            "Customer: I signed up for your competitor tracking tool yesterday but still haven’t received any onboarding email.\n"
            "Support Agent: I'm really sorry about the delay! Can you please confirm the email you used to sign up?\n"
            "Customer: Yes, it’s mark.jones@email.com\n"
            "Support Agent: Thank you! I'm checking this for you now.\n\n"
            "Now, please write the next message in this chat from the support agent, explaining that the email was sent but went to the spam folder and how to access it."
        )
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages
)
print(response.choices[0].message.content.strip())
Here is an example of a customer support chat.
Multi-Model Prompting
As the name suggests, multi-model prompting involves using two or more models (or model calls) to generate a response to a specific question or task. This lets us leverage the strengths of different models: one might excel at understanding and analyzing the prompt, while another might be better at executing the task or generating the final output. By chaining them together, we can improve the overall quality and robustness of the response. Typical use cases include improving answer quality, handling complex queries, or combining different capabilities into a single workflow.
from openai import OpenAI

client = OpenAI()

reviews = """
“I have super sensitive skin and most moisturizers make me break out. This one was light, absorbed well, and no reaction!”
“Smells amazing and makes my skin feel smooth. But the bottle is too small for the price.”
“I’ve been using this cream for 3 weeks and my redness is gone. Please make a bigger size.”
“Didn’t work for me. Left my face greasy. Gave it to my sister and she loved it.”
"""

# Step 1: extract the main themes from the raw reviews
step1_response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a product researcher analyzing customer feedback."},
        {"role": "user", "content": f"Extract the 3-4 main themes from the following customer reviews:\n\n{reviews}"}
    ]
)
themes = step1_response.choices[0].message.content.strip()
print("🧩 Extracted Themes:\n", themes, "\n")

# Step 2: turn the themes into customer personas
step2_response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a marketing strategist creating customer personas."},
        {"role": "user", "content": f"Based on these product review themes, generate 2 customer personas:\n\n{themes}"}
    ]
)
personas = step2_response.choices[0].message.content.strip()
print("👤 Customer Personas:\n", personas, "\n")

# Step 3: write ad copy for the first persona (here we simply take the first line of the personas output)
persona_1 = personas.split("\n")[0]
step3_response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a creative ad copywriter for Instagram."},
        {"role": "user", "content": f"Write a short, catchy Instagram ad for the following persona:\n\n{persona_1}"}
    ]
)
ad_copy = step3_response.choices[0].message.content.strip()
print("📢 Instagram Ad Copy:\n", ad_copy)
Here is an example of a multi-step pipeline that extracts themes, generates personas, and creates Instagram ad copy. For simplicity, this snippet uses gpt-4o for every step; in a true multi-model setup you would swap in a different model where it is strongest.
I have attached a table below comparing all the techniques (I used GPT to help put this together, hehe; the actual cost can vary, but it will mostly be in this range).
Prompting Technique | Description | When to Use | Cost | Accuracy |
Zero-Shot | No examples, direct prompt | Simple/common tasks | Low | Medium |
Few-Shot | Prompt + few examples | Slightly complex or format-sensitive tasks | Medium | High |
Chain-of-Thought | Step-by-step reasoning | Complex reasoning tasks | High | Very High |
Self-Consistency | Multiple runs, take consensus | Critical accuracy needed | Very High | Highest |
Instruction | Direct, task-based prompt | Simple actions | Low | Medium |
Persona-Based | Prompting with a tone/style | Conversational or UX-sensitive apps | Medium | High |
Contextual | Uses previous chat history | Multi-turn conversations | Medium | High |
Multi-Model | Multiple steps across models | Complex workflows | High | Very High |
These are some types of prompting we use to get the most accurate and cost-effective output from the model. It's not always the case that we use them individually; often, we combine them, like using a chain of thought with self-consistency. Make sure to choose the type of prompting that best fits your task while considering the costs. This will help you get the best output for the price.
In the next blog, we will create our own mini cursor and learn about the AI agent. Stay tuned for that! If you enjoyed my blog, please like it, follow me, and share it. Feel free to comment if you have any questions. Until next time, see you!
If you want to discuss the blog or anything tech-related, you can reach out to me on X or book a meeting on cal com. I'll provide my Linktree below.
Yagya Goel
Socials - link
Written by Yagya Goel
Hi, I'm Yagya Goel, a passionate full-stack developer with a strong interest in DevOps, backend development, and occasionally diving into the frontend world. I enjoy exploring new technologies and sharing my knowledge through weekly blogs. My journey involves working on various projects, experimenting with innovative tools, and continually improving my skills. Join me as I navigate the tech landscape, bringing insights, tutorials, and experiences to help others on their own tech journeys. You can checkout my linkedin for more about me : https://linkedin.com/in/yagyagoel