Master prompting

somil
7 min read

Prompting is the technique of giving input (called a prompt) to a language model like GPT to guide its output. A prompt can be a question, a sentence, a command, or even a few examples. The model generates a response based on the context and intent of the prompt.

Alpaca Prompt Format

A simple instruction-following format, used in instruction-tuned models like Alpaca.

Structure:

### Instruction:
Explain the process of photosynthesis.

### Response:
Photosynthesis is the process by which green plants...
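
A tiny helper that renders this template (the optional `### Input:` block is my assumption, following the common Alpaca convention for tasks that carry extra data):

```python
def alpaca_prompt(instruction: str, input_text: str = "") -> str:
    """Render an instruction (and optional input) in the Alpaca template."""
    prompt = f"### Instruction:\n{instruction}\n\n"
    if input_text:
        prompt += f"### Input:\n{input_text}\n\n"
    # The model's completion continues right after this header.
    prompt += "### Response:\n"
    return prompt

print(alpaca_prompt("Explain the process of photosynthesis."))
```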

Instruct Format

Used in models trained to follow human instructions, like OpenAI’s text-davinci-003.

Structure:
Plain natural language instruction, often with no special formatting.

ChatML Format

Used by OpenAI’s chat-based models (GPT-4, GPT-3.5), this format uses role-based messages.

Structure:

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
What's the capital of France?<|im_end|>
<|im_start|>assistant
The capital of France is Paris.<|im_end|>

Self-Consistency Prompting

A reasoning-based prompting method that samples multiple diverse outputs and selects the most consistent answer among them.
Use case: Improves accuracy in math or logical tasks.
Example: “Let’s think step by step.” → Generate multiple answers → Pick the most frequent/correct.
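
The selection step can be sketched as a majority vote over several sampled answers. In practice each sample would come from a separate model call with temperature > 0; here the samples are hard-coded stand-ins:

```python
from collections import Counter

def most_consistent(answers: list[str]) -> str:
    """Pick the answer that appears most often across the samples."""
    normalized = [a.strip().lower() for a in answers]
    winner, _count = Counter(normalized).most_common(1)[0]
    return winner

# Five sampled chain-of-thought runs, reduced to their final answers.
samples = ["72", "72", "68", "72", "70"]
print(most_consistent(samples))  # 72
```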

Persona-based Prompting

Sets a specific identity, tone, or character for the model.
Example:

You are Albert Einstein. Explain relativity to a high school student.

Role-playing Prompting

Extends persona prompting to simulate interactive dialogue between roles.
Example:

You are a doctor, and I am a patient. Ask me questions to diagnose my condition.

Contextual Prompting

Provides background information or prior conversation to guide the response.
Example:

Context: Alice is Bob’s sister. Bob likes to play guitar.
Question: What instrument might Alice hear often?
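
One way to wire this up with a chat API is to put the background facts into a system message and the question into the user turn (the message shape follows the OpenAI-style role/content convention):

```python
def with_context(context: str, question: str) -> list[dict]:
    """Bundle background facts and a question as chat messages."""
    return [
        {"role": "system", "content": f"Answer using this context:\n{context}"},
        {"role": "user", "content": question},
    ]

messages = with_context(
    "Alice is Bob's sister. Bob likes to play guitar.",
    "What instrument might Alice hear often?",
)
# client.chat.completions.create(model="gpt-4", messages=messages)
```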

Multimodal Prompting

Used with models that accept multiple input types (text + image, video, audio, etc.).
Example:

  • Text: “Describe the emotions in this image.”

  • Image: [Uploaded face photo with expression]
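
With OpenAI's chat API, such a text-plus-image turn is expressed as a list of content parts. A sketch (the image URL is a placeholder, and the commented-out call assumes a vision-capable model such as gpt-4o):

```python
def build_multimodal_message(text: str, image_url: str) -> dict:
    """One user turn mixing a text part and an image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

message = build_multimodal_message(
    "Describe the emotions in this image.",
    "https://example.com/face.jpg",  # placeholder URL
)
# client.chat.completions.create(model="gpt-4o", messages=[message])
```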

Implementations

Zero-shot prompting with the OpenAI SDK (a single question, no examples):

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

client = OpenAI()

result = client.chat.completions.create(
    model="gpt-4",
    messages=[
        { "role": "user", "content": "What is greater? 9.8 or 9.11?" } # Zero-shot prompting
    ]
)

print(result.choices[0].message.content)

The same zero-shot call through Google's Gemini SDK:

import os

from google import genai

client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))

response = client.models.generate_content(
    model='gemini-2.0-flash-001', contents='Why is the sky blue?'
)
print(response.text)

Few-shot prompting: the system prompt carries worked examples that set both the scope and the tone of the answers.

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

client = OpenAI()

system_prompt = """
You are an AI Assistant who is specialized in maths.
You should not answer any query that is not related to maths.

For a given query, help the user solve it along with an explanation.

Example:
Input: 2 + 2
Output: 2 + 2 is 4 which is calculated by adding 2 with 2.

Input: 3 * 10
Output: 3 * 10 is 30 which is calculated by multiplying 3 by 10. Fun fact: you can even multiply 10 * 3, which gives the same result.

Input: Why is sky blue?
Output: Bruh? You alright? Is it maths query?
"""

result = client.chat.completions.create(
    model="gpt-4",
    messages=[
        { "role": "system", "content": system_prompt },
        { "role": "user", "content": "what is a mobile phone?" }
    ]
)

print(result.choices[0].message.content)

Chain-of-thought prompting: the model is walked through explicit analyse/think/output/validate/result steps. Here the intermediate steps are injected by hand as assistant messages.

import json

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

client = OpenAI()

system_prompt = """
You are an AI assistant who is an expert at breaking down complex problems and then resolving the user query.

For the given user input, analyse the input and break down the problem step by step.
Think through at least 5-6 steps on how to solve the problem before solving it.

The steps are: you get a user input, you analyse it, you think, you think again several times, then you return an output with an explanation, and finally you validate the output before giving the final result.

Follow the steps in sequence, that is: "analyse", "think", "output", "validate" and finally "result".

Rules:
1. Follow the strict JSON output as per Output schema.
2. Always perform one step at a time and wait for next input
3. Carefully analyse the user query

Output Format:
{ "step": "string", "content": "string" }

Example:
Input: What is 2 + 2.
Output: { "step": "analyse", "content": "Alright! The user is interested in a maths query and is asking a basic arithmetic operation" }
Output: { "step": "think", "content": "To perform the addition I must go from left to right and add all the operands" }
Output: { "step": "output", "content": "4" }
Output: { "step": "validate", "content": "Seems like 4 is the correct answer for 2 + 2" }
Output: { "step": "result", "content": "2 + 2 = 4 and that is calculated by adding all numbers" }

"""

result = client.chat.completions.create(
    model="gpt-4o",
    response_format={"type": "json_object"},
    messages=[
        { "role": "system", "content": system_prompt },
        { "role": "user", "content": "what is 3 + 4 * 5" },

        # Earlier steps already produced by the model, replayed as assistant turns
        { "role": "assistant", "content": json.dumps({"step": "analyse", "content": "The user is asking for an arithmetic operation that involves both addition and multiplication, so I need to follow the order of operations."})  },
        { "role": "assistant", "content": json.dumps({"step": "think", "content": "In order of operations, multiplication should be performed before addition. Therefore, I should first multiply 4 by 5."}) },
        { "role": "assistant", "content": json.dumps({"step": "think", "content": "Calculate the multiplication: 4 * 5 = 20."}) },
        { "role": "assistant", "content": json.dumps({"step": "think", "content": "Next, I need to add the result of the multiplication (20) to the number 3."}) }
    ]
)

print(result.choices[0].message.content)

The same chain-of-thought flow, automated: each step the model returns is parsed and appended back into the conversation in a loop.

import json

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

client = OpenAI()

system_prompt = """
You are an AI assistant who is an expert at breaking down complex problems and then resolving the user query.

For the given user input, analyse the input and break down the problem step by step.
Think through at least 5-6 steps on how to solve the problem before solving it.

The steps are: you get a user input, you analyse it, you think, you think again several times, then you return an output with an explanation, and finally you validate the output before giving the final result.

Follow the steps in sequence, that is: "analyse", "think", "output", "validate" and finally "result".

Rules:
1. Follow the strict JSON output as per Output schema.
2. Always perform one step at a time and wait for next input
3. Carefully analyse the user query

Output Format:
{ "step": "string", "content": "string" }

Example:
Input: What is 2 + 2.
Output: { "step": "analyse", "content": "Alright! The user is interested in a maths query and is asking a basic arithmetic operation" }
Output: { "step": "think", "content": "To perform the addition I must go from left to right and add all the operands" }
Output: { "step": "output", "content": "4" }
Output: { "step": "validate", "content": "Seems like 4 is the correct answer for 2 + 2" }
Output: { "step": "result", "content": "2 + 2 = 4 and that is calculated by adding all numbers" }

"""

messages = [
    { "role": "system", "content": system_prompt },
]


query = input("> ")
messages.append({ "role": "user", "content": query })


while True:
    response = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=messages
    )

    parsed_response = json.loads(response.choices[0].message.content)
    messages.append({ "role": "assistant", "content": json.dumps(parsed_response) })

    if parsed_response.get("step") != "result":
        print(f"🧠: {parsed_response.get('content')}")
        continue

    print(f"🤖: {parsed_response.get('content')}")
    break

Tooling in Generative AI (GenAI)

Tooling refers to the set of software tools, frameworks, libraries, and platforms used to develop, deploy, test, and interact with GenAI models. These tools support everything from model training and fine-tuning to prompting, evaluation, and integration into applications.

1. Model APIs & Platforms

  • OpenAI (ChatGPT, GPT-4)

  • Anthropic (Claude)

  • Google Vertex AI (Gemini)

  • Hugging Face Inference API

  • Cohere, Mistral, Groq, Perplexity

2. Frameworks & Libraries

  • Transformers (by Hugging Face) – for working with pretrained LLMs.

  • LangChain – for building LLM-powered applications using chains and agents.

  • LlamaIndex – for knowledge-augmented generation (RAG).

  • OpenLLM – for serving open-source models easily.

  • Haystack – for building search and RAG pipelines.

3. Prompt Engineering Tools

  • PromptLayer – logs, tracks, and compares prompts.

  • LangSmith – experiment tracking and evaluation for LangChain apps.

  • Chainlit – build LLM frontends quickly.

4. Fine-tuning & Training Tools

  • LoRA / QLoRA – lightweight fine-tuning methods.

  • PEFT (Parameter-Efficient Fine-Tuning) – efficient training.

  • DeepSpeed, Hugging Face Trainer, Axolotl – for scalable training.

5. Evaluation & Testing

  • TruLens – for evaluating LLM quality, relevance, and safety.

  • Ragas – evaluation framework for RAG pipelines.

  • Promptfoo – A/B testing and performance benchmarking of prompts.

6. Deployment & Serving

  • vLLM – optimized LLM inference engine.

  • TGI (Text Generation Inference) – by Hugging Face.

  • Modal / Replicate / Banana.dev – cloud GPU serving platforms.

  • Ollama – run models like LLaMA locally with ease.

7. Multimodal Tooling

  • OpenAI’s GPT-4o, Gemini, Claude 3 – accept text + image/video/audio.

  • Transformers + timm / torchaudio – for image and audio support.

  • Sora (OpenAI) – for video generation.

Why Tooling Matters

  • Boosts developer productivity

  • Ensures reliable, scalable deployments

  • Enables rapid experimentation

  • Supports responsible AI with traceability and evaluation


Written by

somil

I am a full-stack web developer, learning ai, web3, automation