Prompt Engineering: The Art of Effective AI Communication

💡
I'm not an expert in this, but I'm an enthusiastic learner of generative artificial intelligence. All the knowledge I share is verified against reliable sources. Please feel free to point out areas for improvement and how I can do better; I'm definitely looking forward to your valuable feedback. Now, go ahead… 😊

Why the most criticized AI skill might be the most valuable one you'll learn.

You must have seen pictures like this. They're all over the internet.

Haha😉, the hate is real.
I also hated this term until I understood it better.

I'm not here to convince you to change your mind about "Prompt Engineering."

Instead, I'll leave that decision to you. But I do suggest that before making a final judgment, we should at least try to understand what “Prompt Engineering” is.

So, let's give it a try, shall we?

Does this tweet make any sense to you? If not, that's completely fine, because trust me: after reading this article, you will have a clear picture of why someone would say that.

Before we jump into the topic, let me answer some of the hottest questions on the internet.

Is Prompt Engineering only a software developer thing?

I am a physician/doctor/graphic designer/painter/businessman/creative writer/architect … (list goes on). Should I learn prompt engineering?

My thoughts on these questions → No, prompt engineering is not only for software developers. It's a universal skill for anyone willing to level up in their own domain with the help of LLMs. And let me tell you, using AI in your domain is no longer a big no-no. Whether you are a doctor looking for a precise answer to a medical query, a graphic designer hunting for creative inspiration, or a singer searching for a tune that matches your rhythm, knowing how to communicate with these language models so you get the best out of them amplifies your expertise.

Think of it as learning a new language that turns AI into a powerful assistant in your field. Investing time in learning prompt engineering benefits you in the long run, no matter your profession.

Now, with a positive attitude, let's move forward.

What is a prompt?

  • A prompt is simply an instruction or a question you give to an LLM to get it to do something for you. A prompt can be a single word, a sentence, or even a detailed paragraph, depending on what you're trying to accomplish.

Now, what is Prompt Engineering?

  • Prompt engineering is the process of designing high-quality prompts that guide LLMs to produce accurate outputs.

When you write a prompt, you are attempting to set up the LLM to predict the right sequence of tokens.

You don't need to be a data scientist or a machine learning engineer; everyone can write a prompt. However, writing a prompt that suits your requirements can be tricky, because it depends on multiple factors: the model configuration, your requirements, your choice and tone of words, any examples you provide, and the structure and context of the question.

An ambiguous question will result in an ambiguous answer. As simple as that. If you ask a question that already sounds confusing to you, it will most probably confuse the LLM too and result in hallucinations.

As I said, there's more to it than the hate we see on the internet.

In a recent talk at YC's AI Startup School, Andrej Karpathy discussed the psychology of LLMs. He characterized LLMs as "fallible savants" with unique cognitive quirks: they can be superhuman at some tasks, yet make mistakes no human would make. He also compared LLMs to a patient with anterograde amnesia, meaning they can't form new long-term memories; whatever falls outside their context window in a conversation is forgotten.

So now it is fair to say that it is certainly not just another buzzword, but a universal skill worth learning.

💡
In this article, we will discuss how to adjust and modify LLM outputs based on our needs.

Some Jargon to Define LLMs’ Output


  1. Output Length

Although it seems straightforward, many people find it confusing. It simply sets the maximum number of tokens the model may generate in its response. It doesn't guarantee the quality of the response; it only caps how long the response can be.

Keep in mind: a long response uses many tokens, which means the LLM uses more computing power and the response costs more.
A shorter response uses fewer tokens, so the LLM uses less computing power and the response costs less.

Reducing the output length doesn't make the LLM stylistically or textually more succinct in the output it creates; it just causes the LLM to stop predicting more tokens once the limit is reached.
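If you're curious where this knob lives in practice, here is a minimal sketch assuming the OpenAI Python SDK; the model name and prompt are placeholders, and other providers expose an equivalent setting, often called max tokens or max output tokens:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Cap the response at 100 tokens. The model does not "summarize to fit";
# it simply stops generating once the limit is reached.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Explain what a token is."}],
    max_tokens=100,
)

print(response.choices[0].message.content)
```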


  2. Sampling Control (the settings that define how the next token will be chosen)

Language models do not predict a single token outright; they assign a probability to every token in their vocabulary for what the next token could be. Given that list of probabilities, the sampling control settings determine how one output token is actually chosen.

Some of the most common sampling control configurations:

2a. Temperature: Creativity Control Room

  • Temperature (you may think, "I know this one, this is classic", but it may not be what you've convinced yourself it is). Temperature simply defines how much creativity/risk the LLM should take with its word choices.

How It Works: The Pizza Restaurant Analogy

Imagine you're at a pizza restaurant and the waiter asks what you want. The menu shows popularity ratings:

  • Margherita Pizza: 45% of customers order this

  • Pepperoni Pizza: 30% of customers order this

  • Hawaiian Pizza: 15% of customers order this

  • Veggie Supreme: 8% of customers order this

  • Anchovy Special: 2% of customers order this

Temperature controls how much you stick to popular choices vs. trying something different.

Let's take another example:

—— " The cat walked into the room and saw a..."

Now, let's use this sentence to see how different 'Sampling Control' settings affect the LLM's word choices.

💡
These code blocks are for visual ease; they are not actual TypeScript code.
The LLM's internal probability distribution might look like this:

"mouse" (35%)
"ball" (20%)
"person" (18%)
"toy" (12%)
"shadow" (8%)
"ghost" (4%)
"unicorn" (2%)
"spaceship" (1%)

Now let's see how different temperature settings affect the output:

Temperature = 0 (Greedy Decoding)

Behavior: Always picks the highest probability word, no exceptions.

Output: "The cat walked into the room and saw a mouse." 
Every single time: "mouse" (because it has 35% probability - the highest)

Why use this:

Math problems: "2 + 2 = 4" (not "2 + 2 = purple")
Code generation: Variable names should be consistent
Factual questions: "Paris is the capital of France" (not "Paris is the capital of pizza")

Temperature = 0.2 (Very Low)

Behavior: Heavily favors high-probability words, but allows tiny variations.

Possible outputs:

"The cat walked into the room and saw a mouse." (90% of the time)
"The cat walked into the room and saw a ball." (8% of the time)
"The cat walked into the room and saw a person." (2% of the time)

Why use this:

Professional emails: Consistent tone, occasional word variety
Technical documentation: Clear but not robotic
Customer service responses: Reliable but slightly personalized

Temperature = 0.5 (Low-Medium)

Behavior: Still prefers likely words but gives reasonable alternatives a fair chance.

Possible outputs:

"The cat walked into the room and saw a mouse." (50% of the time)
"The cat walked into the room and saw a ball." (25% of the time)
"The cat walked into the room and saw a person." (20% of the time)
"The cat walked into the room and saw a toy." (5% of the time)

Why use this:

Blog writing: Natural variety without being weird
Conversational AI: Engaging but predictable
Content creation: Fresh but coherent

Temperature = 0.8 (Medium-High)

Behavior: Good balance - explores less likely options while staying sensible.

Possible outputs:

"The cat walked into the room and saw a mouse." (25% of the time)
"The cat walked into the room and saw a ball." (20% of the time)
"The cat walked into the room and saw a person." (20% of the time)
"The cat walked into the room and saw a toy." (15% of the time)
"The cat walked into the room and saw a shadow." (12% of the time)
"The cat walked into the room and saw a ghost." (8% of the time)

Why use this:

Creative writing: Interesting but logical
Story generation: Surprising but believable plot points
Dialogue writing: Natural human-like conversation

Temperature = 1.2 (High)

Behavior: Much more willing to pick unusual options, probability differences matter less.

Possible outputs:

"The cat walked into the room and saw a unicorn."
"The cat walked into the room and saw a spaceship."
"The cat walked into the room and saw a mouse wearing a tiny hat."
"The cat walked into the room and saw a portal to another dimension."
"The cat walked into the room and saw a philosophical debate."

Why use this:

Brainstorming: Need completely unexpected ideas
Surreal creative writing: Fantasy, sci-fi, experimental fiction
Idea generation: Breaking out of conventional thinking

Temperature = 2.0+ (Very High)

Behavior: Almost random - all words become nearly equally likely.

Possible outputs:

"The cat walked into the room and saw a refrigerator singing opera."
"The cat walked into the room and saw a mathematics of purple seventeen."
"The cat walked into the room and saw a democracy flavored with nostalgia."

Why use this:

Abstract art generation
Surreal poetry
Random idea sparks (though often nonsensical)
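If you want to see this numerically, here is a small self-contained Python sketch: just the standard temperature math applied to our made-up cat-sentence probabilities, not any model's real internals.

```python
import math

# Made-up next-token probabilities from the example above
probs = {"mouse": 0.35, "ball": 0.20, "person": 0.18, "toy": 0.12,
         "shadow": 0.08, "ghost": 0.04, "unicorn": 0.02, "spaceship": 0.01}

def apply_temperature(probs, temperature):
    """Divide the log-probabilities (logits) by the temperature,
    then renormalize with a softmax."""
    scaled = {tok: math.log(p) / temperature for tok, p in probs.items()}
    total = sum(math.exp(v) for v in scaled.values())
    return {tok: math.exp(v) / total for tok, v in scaled.items()}

# Temperature 0 is a special case: just take the argmax (greedy decoding).
for t in (0.2, 0.8, 1.2, 2.0):
    rescaled = apply_temperature(probs, t)
    top3 = sorted(rescaled.items(), key=lambda kv: -kv[1])[:3]
    print(f"T={t}: " + ", ".join(f"{tok} {p:.0%}" for tok, p in top3))
```

Running it, temperature 0.2 pushes "mouse" above 90%, while 2.0 spreads the probability mass far more evenly across the list, which is exactly the behavior described above.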

2b. Top-K: Limiting the Menu

What it does: Only considers the K most likely words, ignoring the rest.

Top-K = 1

Available choices: Only "mouse" (the single highest probability word)
Output: "The cat walked into the room and saw a mouse." (always exactly the same)

Why use this: When you need 100% predictable results. Same as setting temperature to 0.

Top-K = 3

Available choices: "mouse," "ball," "person" (top 3 most likely)

Possible outputs:
"The cat walked into the room and saw a mouse."
"The cat walked into the room and saw a ball."
"The cat walked into the room and saw a person."

Why use this: Very controlled creativity - safe options only.

Top-K = 6

Available choices: "mouse," "ball," "person," "toy," "shadow," "ghost" (top 6)

Possible outputs:
"The cat walked into the room and saw a shadow."
"The cat walked into the room and saw a ghost."
(Plus all previous options)

Why use this: Moderate creativity - includes some interesting but reasonable choices.

Top-K = 50

Available choices: All 8 words from our list, plus 42 other possible words like "butterfly," "mirror," 
"photograph," "rainbow," "telephone," etc.

Possible outputs:
"The cat walked into the room and saw a butterfly."
"The cat walked into the room and saw a mysterious photograph."
"The cat walked into the room and saw a ringing telephone."

Why use this: High creativity - many unexpected but sensible possibilities.
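The same idea as a tiny Python sketch: keep only the K most likely tokens, renormalize, and sample from what is left (illustrative math on the made-up distribution above, not a real model):

```python
import random

# Same made-up distribution as before
probs = {"mouse": 0.35, "ball": 0.20, "person": 0.18, "toy": 0.12,
         "shadow": 0.08, "ghost": 0.04, "unicorn": 0.02, "spaceship": 0.01}

def top_k_filter(probs, k):
    """Keep only the k highest-probability tokens and renormalize."""
    kept = dict(sorted(probs.items(), key=lambda kv: -kv[1])[:k])
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

for k in (1, 3, 6):
    print(f"Top-K={k}: candidates = {list(top_k_filter(probs, k))}")

# Sample one token from the k=3 shortlist
shortlist = top_k_filter(probs, 3)
tokens, weights = list(shortlist), list(shortlist.values())
print("Sampled:", random.choices(tokens, weights=weights)[0])
```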

2c. Top-P (Nucleus Sampling): The Probability Threshold

What it does: Includes words until their combined probability reaches P, then stops.

Top-P = 0.5 (50% threshold)
Process: Keep adding words until we hit 50% total probability

"mouse" (35%) → Running total: 35%
"ball" (20%) → Running total: 55% ← STOP! We passed 50%

Available choices: Only "mouse" and "ball"

Possible outputs:
"The cat walked into the room and saw a mouse."
"The cat walked into the room and saw a ball."

Why use this: Very safe, predictable outputs. Good for professional writing where you cannot afford weird word choices.
---------------------------------------------------------------------------------------------------------
Top-P = 0.75 (75% threshold)
Process: Keep adding until 75%

"mouse" (35%) → 35%
"ball" (20%) → 55%
"person" (18%) → 73%
"toy" (12%) → 85% ← STOP! We passed 75%

Available choices: "mouse," "ball," "person," "toy"

Possible outputs:
"The cat walked into the room and saw a mouse."
"The cat walked into the room and saw a person."
"The cat walked into the room and saw a toy."

Why use this: Good balance - includes reasonable alternatives while filtering out the weird stuff.
---------------------------------------------------------------------------------------------------------
Top-P = 0.95 (95% threshold)
Process: Keep adding until 95%

"mouse" (35%) → 35%
"ball" (20%) → 55%
"person" (18%) → 73%
"toy" (12%) → 85%
"shadow" (8%) → 93%
"ghost" (4%) → 97% ← STOP! We passed 95%

Available choices: "mouse," "ball," "person," "toy," "shadow," "ghost"

Possible outputs:
"The cat walked into the room and saw a shadow."
"The cat walked into the room and saw a ghost."
(Plus all the previous options)

Why use this: Most common setting - gives creativity while avoiding the really bizarre choices like "unicorn" and
"spaceship."
---------------------------------------------------------------------------------------------------------
Top-P = 1.0 (100% threshold)
Process: Include everything, no matter how unlikely
Available choices: Every single word, including "unicorn" and "spaceship"

Possible outputs:
"The cat walked into the room and saw a unicorn."
"The cat walked into the room and saw a spaceship."
"The cat walked into the room and saw a philosophical debate."

Why use this: Maximum creativity, but risk of nonsensical results.
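The nucleus cutoff can be sketched the same way: sort by probability and keep adding tokens until the running total reaches P (as in the walkthrough above, the token that crosses the threshold is kept):

```python
# Same made-up distribution as before
probs = {"mouse": 0.35, "ball": 0.20, "person": 0.18, "toy": 0.12,
         "shadow": 0.08, "ghost": 0.04, "unicorn": 0.02, "spaceship": 0.01}

def top_p_filter(probs, p):
    """Keep the smallest set of most-likely tokens whose cumulative
    probability reaches p, then renormalize."""
    kept, cumulative = {}, 0.0
    for tok, prob in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept[tok] = prob
        cumulative += prob
        if cumulative >= p:  # the token that crosses the threshold stays in
            break
    total = sum(kept.values())
    return {tok: prob / total for tok, prob in kept.items()}

for p in (0.5, 0.75, 0.95, 1.0):
    print(f"Top-P={p}: candidates = {list(top_p_filter(probs, p))}")
```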

The best way to choose between top-K and top-P is to experiment with both methods (or both together) and see which one produces the results you are looking for.

Now that we have the whole picture of how LLMs pick the next token, it's important to note that these configurations are used together when crafting a prompt, because the end goal is to write an appropriate prompt for your requirements.

So, how do they work together?

The model applies these filters in sequence (the exact order can vary between implementations):

  1. First, apply Top-K and Top-P filters to create a shortlist

  2. Then, use temperature to sample from that shortlist

Example:

Settings: Temperature=0.7, Top-K=10, Top-P=0.8

For "Write a story about a dragon who...":

Top-K limits to 10 most likely next words: "lived," "flew," "breathed," "loved," "feared," "discovered," "lost,"
 "wanted," "dreamed," "fought"

Top-P removes words until cumulative probability ≤ 80%: keeps "lived," "flew," "breathed," "loved," "feared"

Temperature=0.7 samples from these 5 options with moderate randomness

Result: "Write a story about a dragon who feared the dark."

Common Problems and Solutions

The "Repetition Loop Bug"

Problem: AI gets stuck repeating the same phrases over and over.

Example of bad output: "The solution is simple, the solution is simple, the solution is simple, the solution is..."

Why does it happen?

  • Low temperature: AI becomes too rigid, follows the same probability path

  • High temperature: Random choices accidentally cycle back to the previous text

Solution: Adjust the temperature to the 0.3-0.7 range and use Top-P around 0.9-0.95.

Extreme Settings Cancel Each Other Out

Temperature = 0: Makes Top-K and Top-P irrelevant (always picks the most probable)

Top-K = 1: Makes temperature irrelevant (only one choice available)

Top-P = 0: Makes other settings irrelevant (only the most probable word allowed)

Quick Reference

Here is a table of different use case scenarios with various sampling control configurations.

| Task Type | Temperature | Top-P | Top-K | Example Use |
| --- | --- | --- | --- | --- |
| Math/Code | 0 | 0.9 | 20 | "What is 2+2?" |
| Technical Writing | 0.1-0.2 | 0.9 | 20 | API documentation |
| Business Writing | 0.2-0.5 | 0.95 | 30 | Professional emails |
| Creative Writing | 0.7-0.9 | 0.99 | 40 | Short stories |
| Brainstorming | 0.8-1.0 | 0.99 | 50 | Idea generation |
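As a concrete sketch, here is how the "Business Writing" row might be applied with the OpenAI Python SDK (the model name and prompt are placeholders). Note that this particular API exposes temperature and top_p but no top_k parameter, so the Top-K column only applies to providers that support it:

```python
from openai import OpenAI

client = OpenAI()

# "Business Writing" preset from the table above
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Draft a short follow-up email to a client."}],
    temperature=0.3,   # within the 0.2-0.5 range
    top_p=0.95,
    max_tokens=200,    # keep the reply short and inexpensive
)

print(response.choices[0].message.content)
```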

Conclusion

So far, we have seen how to tweak an LLM's output to your needs through sampling control configurations, and we have discussed some common problems and their solutions. Understanding temperature, Top-K, and Top-P settings allows you to fine-tune creativity versus consistency based on your specific needs.

Therefore, the question isn't whether prompt engineering is "real" or just hype. The question is: Are you ready to harness the full potential of AI tools in your field?


Haha 😒 Did you forget that at the start of this article I showed you a tweet saying——

The hottest new programming language is ‘English‘.

- Andrej Karpathy

Now, across the whole article, have you seen anything tied to one particular field? No, right? As I said, it is universal.

Isn't it amazing that we can communicate with these incredibly smart language models using just plain, structured English?

We can use English to communicate with LLMs to build applications and businesses, solve our problems, demystify ongoing challenges, and create exciting new opportunities. In the very near future (perhaps within the next six months), the whole industry will be driven by an agentic AI ecosystem, where your prompts become an intellectual asset of your business. Therefore, it is extremely important to know how to craft those assets.

And all of this is in simple English, strategically crafted for specific use cases.

I mean, it’s mind-boggling, right?

See this article to get more insights: Takeaways from the AI Engineer World's Fair: The startup playbook is being rewritten in real-time – GeekWire (It’s not about prompt engineering)


Now, you may assume that, well, I’m good with English. So does that mean —

Well, we will see……

In my next article, we will continue from here onwards, where I’ll write about various prompting techniques.

Feedback is appreciated.
