Prompt Engineering: The Art of Effective AI Communication


Why the most criticized AI skill might be the most valuable one you'll learn.
You must have seen memes mocking "Prompt Engineering." They're all over the internet.
Haha😉, the hate is real.
I also hated this term until I understood it better.
I'm not here to convince you to change your mind about "Prompt Engineering."
Instead, I'll leave that decision to you. But I do suggest that before making a final judgment, we should at least try to understand what “Prompt Engineering” is.
So, let's give it a try, shall we?
Remember Andrej Karpathy's tweet: "The hottest new programming language is English." Does it make any sense to you? If not, that's completely fine, because trust me, after reading this article you will have a clear picture of why he said so.
Before we jump into the topic, let me answer some of the hottest questions on the internet.
Is Prompt Engineering only a software developer thing?
I am a doctor/graphic designer/painter/businessman/creative writer/architect… (the list goes on). Should I learn prompt engineering?
My thoughts on these questions: No, prompt engineering is not only for software developers. It's a universal skill for anyone willing to level up in their own domain with the help of LLMs. And let me tell you, using AI in your domain is not a big no-no anymore. Whether you are a doctor crafting a precise medical query, a graphic designer looking for creative inspiration, or a singer searching for a tune that matches your rhythm, knowing how to communicate with these language models amplifies your expertise.
Think of it as learning a new language that turns AI into a powerful assistant in your field. Investing time in learning prompt engineering pays off in the long run, no matter your profession.
Now, with a positive attitude, let's move forward.
What is a prompt?
- A prompt is simply an instruction or a question you give to an LLM to get it to do something for you. A prompt can be a single word, a sentence, or even a detailed paragraph, depending on what you're trying to accomplish.
Now, what is Prompt Engineering?
- Prompt engineering is the process of designing high-quality prompts that guide LLMs to produce accurate outputs.
When you write a prompt, you are attempting to set up the LLM to predict the right sequence of tokens.
You don't need to be a data scientist or a machine learning engineer; everyone can write a prompt. However, writing a prompt that suits your requirements can be tricky, because it depends on multiple factors: the model configuration, your requirements, your choice and tone of words, any examples you provide, and the structure and context of the question.
An ambiguous question will produce an ambiguous answer. As simple as that. If a question already sounds confusing to you, it will most probably confuse the LLM too and can result in hallucination.
As I said, there's more to it than the hate we see on the internet.
In a recent speech at YC AI Startup School, Andrej Karpathy discussed the psychology of LLMs. He characterizes LLMs as "fallible savants" with unique cognitive quirks: they can be superhuman at some tasks, yet make mistakes no human would make. He compared LLMs to a patient diagnosed with anterograde amnesia, meaning they can't form new long-term memories within a conversation beyond their context window.
So now it is fair to say that this is not just another buzzword, but a universal skill for moving forward.
Some Jargon to Define LLMs’ Output
Output Length
Although it seems straightforward, many people find it confusing. It simply sets the maximum number of tokens the model may generate in its response. It doesn't guarantee the quality of the response, just caps the length.
Keep in mind: a long response means using many tokens, which means LLMs use more computing power, making the response more expensive.
A shorter response uses fewer tokens, so LLMs use less computing power, making the response less expensive.
Reducing the output length doesn't make the LLM stylistically or textually more succinct; it just causes the LLM to stop predicting tokens once the limit is reached.
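If you call a model through an API, this setting usually appears as a max-token parameter. A minimal sketch, assuming the OpenAI Python SDK with an API key in the environment (the model name and prompt are just placeholders):

```python
# Minimal sketch: capping output length via max_tokens.
# Assumes the OpenAI Python SDK is installed and OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Explain tokenization in one paragraph."}],
    max_tokens=100,  # hard cap: generation simply stops at 100 tokens,
                     # possibly mid-sentence; it does NOT make the reply concise
)
print(response.choices[0].message.content)
```

If you actually want a short answer, say so in the prompt ("answer in two sentences"); the cap alone only truncates.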
Sampling Control (the settings that define how the next token is chosen)
Language models don't directly predict a single token; they predict a probability for every token in their vocabulary being the next one. Given that list of probabilities, the sampling control settings determine how one output token is chosen.
Some of the most common sampling control configurations:
2a. Temperature: Creativity Control Room
- Temperature (you may think, "I know this one, this is classic," but it's probably not what you've convinced yourself it is) simply defines how much creativity/risk the LLM should take with its word choices.
How It Works: The Pizza Restaurant Analogy
Imagine you're at a pizza restaurant and the waiter asks what you want. The menu shows popularity ratings:
Margherita Pizza: 45% of customers order this
Pepperoni Pizza: 30% of customers order this
Hawaiian Pizza: 15% of customers order this
Veggie Supreme: 8% of customers order this
Anchovy Special: 2% of customers order this
Temperature controls how much you stick to popular choices vs. trying something different.
Let's take another example:
"The cat walked into the room and saw a..."
Now, let's use this sentence to see how different 'Sampling Control' settings affect the LLM's word choices.
The LLM's internal probability distribution for the next word might look like:
"mouse" (35%)
"ball" (20%)
"person" (18%)
"toy" (12%)
"shadow" (8%)
"ghost" (4%)
"unicorn" (2%)
"spaceship" (1%)
Now let's see how different temperature settings affect the output:
Temperature = 0 (Greedy Decoding)
Behavior: Always picks the highest probability word, no exceptions.
Output: "The cat walked into the room and saw a mouse."
Every single time: "mouse" (because it has 35% probability - the highest)
Why use this:
Math problems: "2 + 2 = 4" (not "2 + 2 = purple")
Code generation: Variable names should be consistent
Factual questions: "Paris is the capital of France" (not "Paris is the capital of pizza")
Temperature = 0.2 (Very Low)
Behavior: Heavily favors high-probability words, but allows tiny variations.
Possible outputs:
"The cat walked into the room and saw a mouse." (90% of the time)
"The cat walked into the room and saw a ball." (8% of the time)
"The cat walked into the room and saw a person." (2% of the time)
Why use this:
Professional emails: Consistent tone, occasional word variety
Technical documentation: Clear but not robotic
Customer service responses: Reliable but slightly personalized
Temperature = 0.5 (Low-Medium)
Behavior: Still prefers likely words but gives reasonable alternatives a fair chance.
Possible outputs:
"The cat walked into the room and saw a mouse." (50% of the time)
"The cat walked into the room and saw a ball." (25% of the time)
"The cat walked into the room and saw a person." (20% of the time)
"The cat walked into the room and saw a toy." (5% of the time)
Why use this:
Blog writing: Natural variety without being weird
Conversational AI: Engaging but predictable
Content creation: Fresh but coherent
Temperature = 0.8 (Medium-High)
Behavior: Good balance - explores less likely options while staying sensible.
Possible outputs:
"The cat walked into the room and saw a mouse." (25% of the time)
"The cat walked into the room and saw a ball." (20% of the time)
"The cat walked into the room and saw a person." (20% of the time)
"The cat walked into the room and saw a toy." (15% of the time)
"The cat walked into the room and saw a shadow." (12% of the time)
"The cat walked into the room and saw a ghost." (8% of the time)
Why use this:
Creative writing: Interesting but logical
Story generation: Surprising but believable plot points
Dialogue writing: Natural human-like conversation
Temperature = 1.2 (High)
Behavior: Much more willing to pick unusual options; probability differences matter less.
Possible outputs:
"The cat walked into the room and saw a unicorn."
"The cat walked into the room and saw a spaceship."
"The cat walked into the room and saw a mouse wearing a tiny hat."
"The cat walked into the room and saw a portal to another dimension."
"The cat walked into the room and saw a philosophical debate."
Why use this:
Brainstorming: Need completely unexpected ideas
Surreal creative writing: Fantasy, sci-fi, experimental fiction
Idea generation: Breaking out of conventional thinking
Temperature = 2.0+ (Very High)
Behavior: Almost random - all words become nearly equally likely.
Possible outputs:
"The cat walked into the room and saw a refrigerator singing opera."
"The cat walked into the room and saw a mathematics of purple seventeen."
"The cat walked into the room and saw a democracy flavored with nostalgia."
Why use this:
Abstract art generation
Surreal poetry
Random idea sparks (though often nonsensical)
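If you're curious what this looks like under the hood, here is a minimal Python sketch of temperature sampling over the toy cat-sentence distribution above (the numbers are this article's illustrative ones, not a real model's):

```python
import math
import random

# Toy next-token distribution from the cat example (sums to 1.0).
probs = {
    "mouse": 0.35, "ball": 0.20, "person": 0.18, "toy": 0.12,
    "shadow": 0.08, "ghost": 0.04, "unicorn": 0.02, "spaceship": 0.01,
}

def sample_with_temperature(probs, temperature):
    if temperature == 0:
        # Greedy decoding: always the single most likely token.
        return max(probs, key=probs.get)
    # Turn probabilities back into logits, rescale by 1/T, re-normalize.
    scaled = {tok: math.log(p) / temperature for tok, p in probs.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(s - top) for tok, s in scaled.items()}
    total = sum(weights.values())
    tokens = list(weights)
    return random.choices(tokens, [weights[t] / total for t in tokens])[0]

for t in (0, 0.2, 0.8, 2.0):
    picks = [sample_with_temperature(probs, t) for _ in range(1000)]
    print(f"T={t}:", {tok: picks.count(tok) for tok in probs})
```

Run it and you'll see temperature 0 always returns "mouse," low temperatures rarely stray from it, and high temperatures flatten the counts toward uniform.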
2b. Top-K: Limiting the Menu
What it does: Only considers the K most likely words, ignoring the rest.
Top-K = 1
Available choices: Only "mouse" (the single highest probability word)
Output: "The cat walked into the room and saw a mouse." (always exactly the same)
Why use this: When you need 100% predictable results. Same as setting temperature to 0.
Top-K = 3
Available choices: "mouse," "ball," "person" (top 3 most likely)
Possible outputs:
"The cat walked into the room and saw a mouse."
"The cat walked into the room and saw a ball."
"The cat walked into the room and saw a person."
Why use this: Very controlled creativity - safe options only.
Top-K = 6
Available choices: "mouse," "ball," "person," "toy," "shadow," "ghost" (top 6)
Possible outputs:
"The cat walked into the room and saw a shadow."
"The cat walked into the room and saw a ghost."
(Plus all previous options)
Why use this: Moderate creativity - includes some interesting but reasonable choices.
Top-K = 50
Available choices: All 8 words from our list, plus 42 other possible words like "butterfly," "mirror," "photograph," "rainbow," "telephone," etc.
Possible outputs:
"The cat walked into the room and saw a butterfly."
"The cat walked into the room and saw a mysterious photograph."
"The cat walked into the room and saw a ringing telephone."
Why use this: High creativity - many unexpected but sensible possibilities.
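A top-K filter is just a few lines of code. A minimal sketch over the same toy distribution (illustrative numbers, not a real model):

```python
import random

# Same toy distribution as in the temperature sketch.
probs = {"mouse": 0.35, "ball": 0.20, "person": 0.18, "toy": 0.12,
         "shadow": 0.08, "ghost": 0.04, "unicorn": 0.02, "spaceship": 0.01}

def top_k_filter(probs, k):
    # Keep only the k most likely tokens, then re-normalize so the
    # surviving probabilities sum to 1 again.
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}

shortlist = top_k_filter(probs, k=3)  # mouse ~0.48, ball ~0.27, person ~0.25
print(random.choices(list(shortlist), weights=list(shortlist.values()))[0])
```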
2c. Top-P (Nucleus Sampling): The Probability Threshold
What it does: Includes words until their combined probability reaches P, then stops.
Top-P = 0.5 (50% threshold)
Process: Keep adding words until we hit 50% total probability
"mouse" (35%) → Running total: 35%
"ball" (20%) → Running total: 55% ← STOP! We passed 50%
Available choices: Only "mouse" and "ball"
Possible outputs:
"The cat walked into the room and saw a mouse."
"The cat walked into the room and saw a ball."
Why use this: Very safe, predictable outputs. Good for professional writing where you cannot afford weird word choices.
---
Top-P = 0.75 (75% threshold)
Process: Keep adding until 75%
"mouse" (35%) → 35%
"ball" (20%) → 55%
"person" (18%) → 73%
"toy" (12%) → 85% ← STOP! We passed 75%
Available choices: "mouse," "ball," "person," "toy"
Possible outputs:
"The cat walked into the room and saw a mouse."
"The cat walked into the room and saw a person."
"The cat walked into the room and saw a toy."
Why use this: Good balance - includes reasonable alternatives while filtering out the weird stuff.
---
Top-P = 0.95 (95% threshold)
Process: Keep adding until 95%
"mouse" (35%) → 35%
"ball" (20%) → 55%
"person" (18%) → 73%
"toy" (12%) → 85%
"shadow" (8%) → 93%
"ghost" (4%) → 97% ← STOP! We passed 95%
Available choices: "mouse," "ball," "person," "toy," "shadow," "ghost"
Possible outputs:
"The cat walked into the room and saw a shadow."
"The cat walked into the room and saw a ghost."
(Plus all the previous options)
Why use this: Most common setting - gives creativity while avoiding the really bizarre choices like "unicorn" and "spaceship."
---
Top-P = 1.0 (100% threshold)
Process: Include everything, no matter how unlikely
Available choices: Every single word, including "unicorn" and "spaceship"
Possible outputs:
"The cat walked into the room and saw a unicorn."
"The cat walked into the room and saw a spaceship."
"The cat walked into the room and saw a philosophical debate."
Why use this: Maximum creativity, but risk of nonsensical results.
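Nucleus sampling is just as easy to sketch: walk the sorted list and stop once the running total reaches P (again using the toy numbers above):

```python
import random

probs = {"mouse": 0.35, "ball": 0.20, "person": 0.18, "toy": 0.12,
         "shadow": 0.08, "ghost": 0.04, "unicorn": 0.02, "spaceship": 0.01}

def top_p_filter(probs, p):
    # Add tokens in descending probability until the running total
    # reaches p, then re-normalize the survivors.
    kept, running = {}, 0.0
    for tok, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = prob
        running += prob
        if running >= p:
            break
    total = sum(kept.values())
    return {tok: pr / total for tok, pr in kept.items()}

nucleus = top_p_filter(probs, p=0.75)  # keeps mouse, ball, person, toy
print(random.choices(list(nucleus), weights=list(nucleus.values()))[0])
```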
The best way to choose between top-K and top-P is to experiment with both methods (or both together) and see which one produces the results you are looking for.
Now that we have the whole picture of how LLMs choose the next token, it's important to know that all these configurations are used together when forming a prompt, because the end goal is being able to write an appropriate prompt for your requirements.
So, how do they work together?
The AI applies these filters in sequence:
First, apply Top-K and Top-P filters to create a shortlist
Then, use temperature to sample from that shortlist
Example:
Settings: Temperature=0.7, Top-K=10, Top-P=0.8
For "Write a story about a dragon who...":
Top-K limits to the 10 most likely next words: "lived," "flew," "breathed," "loved," "feared," "discovered," "lost," "wanted," "dreamed," "fought"
Top-P keeps words until their cumulative probability reaches 80%: "lived," "flew," "breathed," "loved," "feared"
Temperature=0.7 samples from these 5 options with moderate randomness
Result: "Write a story about a dragon who feared the dark."
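Here's how that sequence might look in code: a sketch following the order described above (note that real inference libraries may apply these filters in a different order internally):

```python
import math
import random

def sample(probs, temperature=0.7, top_k=10, top_p=0.8):
    # 1. Top-K: keep only the k most likely tokens.
    items = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # 2. Top-P: within that shortlist, keep tokens until the cumulative
    #    probability reaches top_p.
    kept, running = [], 0.0
    for tok, p in items:
        kept.append((tok, p))
        running += p
        if running >= top_p:
            break
    # 3. Temperature: rescale the survivors' probabilities and sample.
    if temperature == 0:
        return kept[0][0]  # greedy: the shortlist is sorted, take the head
    weights = [math.exp(math.log(p) / temperature) for _, p in kept]
    return random.choices([tok for tok, _ in kept], weights=weights)[0]

# Reusing the toy cat distribution for illustration:
probs = {"mouse": 0.35, "ball": 0.20, "person": 0.18, "toy": 0.12,
         "shadow": 0.08, "ghost": 0.04, "unicorn": 0.02, "spaceship": 0.01}
print(sample(probs, temperature=0.7, top_k=10, top_p=0.8))
```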
Common Problems and Solutions
The "Repetition Loop Bug"
Problem: AI gets stuck repeating the same phrases over and over.
Example of bad output: "The solution is simple, the solution is simple, the solution is simple, the solution is..."
Why does it happen?
Low temperature: AI becomes too rigid, follows the same probability path
High temperature: Random choices accidentally cycle back to the previous text
Solution: Adjust temperature to 0.3-0.7 range and use Top-P around 0.9-0.95
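If you run a model locally, these knobs map directly onto generation parameters. A sketch assuming Hugging Face transformers is installed; "gpt2" is just a small public model used for illustration:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The solution is simple", return_tensors="pt")
output = model.generate(
    **inputs,
    max_new_tokens=40,
    do_sample=True,          # enable sampling instead of greedy decoding
    temperature=0.5,         # mid-range, per the suggested 0.3-0.7
    top_p=0.9,               # nucleus threshold in the suggested 0.9-0.95 range
    repetition_penalty=1.2,  # extra guard that down-weights repeated tokens
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```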
Extreme Settings Cancel Each Other Out
Temperature = 0: Makes Top-K and Top-P irrelevant (always picks the most probable)
Top-K = 1: Makes temperature irrelevant (only one choice available)
Top-P = 0: Makes other settings irrelevant (only the most probable word allowed)
Quick Reference
Here is a table of different use case scenarios with various sampling control configurations.
| Task Type | Temperature | Top-P | Top-K | Example Use |
| --- | --- | --- | --- | --- |
| Math/Code | 0 | 0.9 | 20 | "What is 2+2?" |
| Technical Writing | 0.1-0.2 | 0.9 | 20 | API documentation |
| Business Writing | 0.2-0.5 | 0.95 | 30 | Professional emails |
| Creative Writing | 0.7-0.9 | 0.99 | 40 | Short stories |
| Brainstorming | 0.8-1.0 | 0.99 | 50 | Idea generation |
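If you want these defaults handy in code, the table translates into a small preset map. The values are this article's suggestions (with midpoints picked for the ranges); treat them as starting points to tune, not a standard:

```python
# Sampling presets mirroring the quick-reference table above.
# Range entries use a midpoint; adjust for your model and task.
PRESETS = {
    "math_code":         {"temperature": 0.0, "top_p": 0.90, "top_k": 20},
    "technical_writing": {"temperature": 0.2, "top_p": 0.90, "top_k": 20},
    "business_writing":  {"temperature": 0.4, "top_p": 0.95, "top_k": 30},
    "creative_writing":  {"temperature": 0.8, "top_p": 0.99, "top_k": 40},
    "brainstorming":     {"temperature": 0.9, "top_p": 0.99, "top_k": 50},
}
```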
Conclusion
So far, we have seen how to tweak an LLM's output to your needs through sampling control configurations, and we discussed some common problems and their solutions. Understanding temperature, Top-K, and Top-P allows you to fine-tune creativity versus consistency based on your specific needs.
Therefore, the question isn't whether prompt engineering is "real" or just hype. The question is: Are you ready to harness the full potential of AI tools in your field?
Haha 😒 Did you forget that at the start of this article I showed you a tweet saying:
The hottest new programming language is ‘English‘.
- Andrej Karpathy
Now, throughout this whole article, did you see anything tied to one particular field? No, right? As I said, it's universal.
Isn't it amazing that we can communicate with these incredibly smart language models using just plain, structured English?
We can use English to communicate with LLMs to build applications and businesses, solve our problems, demystify ongoing challenges, and create exciting new opportunities. In the very near future, the industry will increasingly be driven by an agentic AI ecosystem, where your prompts become an intellectual asset of your business. That makes it extremely important to know how to craft those assets.
And all of this is in simple English, strategically crafted for specific use cases.
I mean, it’s mind-boggling, right?
See this article to get more insights: Takeaways from the AI Engineer World's Fair: The startup playbook is being rewritten in real-time – GeekWire (It’s not about prompt engineering)
Now, you may assume: well, I'm good with English, so does that mean…
Well, we will see…
In my next article, we will continue from here onwards, where I’ll write about various prompting techniques.
Feedback is appreciated.