DSPy Part 4: When AI Meets Chaos – Ambiguous Claims, Biased Data, and Trolls


The Problem: AI’s Midlife Crisis
Our fact-checking intern, Chatty, has grown reliable… until it meets ambiguity or trolls.
Example 1: “Some say Earth is flat.”
Chatty panics: “False! The Earth is round… but maybe those ‘some’ are onto something?”
Example 2: “Bananas are radioactive.” (Spoiler: They are—technically.)
Chatty over-explains: “True! Bananas contain potassium-40, a radioactive isotope. But don’t panic—it’s harmless.”
Why this matters:
Ambiguity breeds misinformation.
Trolls weaponize half-truths.
Without safeguards, AI becomes a megaphone for chaos.
The Fix: Programmatic Safety Nets
We’ll teach Chatty two survival skills:
Say “I don’t know” when evidence is weak.
Detect bias in its own training data.
No more existential crises.
Step 1: Enforce Rules with dspy.Assert
dspy.Assert acts like a bouncer for your AI’s answers. If the output breaks your rules, it gets flagged.
Let’s upgrade our fact-checker:
```python
import dspy

class SafeFactCheck(dspy.Module):
    def __init__(self):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=3)
        self.generate_answer = dspy.ChainOfThought("claim, context -> is_correct, explanation")

    def forward(self, claim):
        context = self.retrieve(claim).passages
        # Rule 1: "Don’t answer if you have no sources!"
        dspy.Assert(len(context) > 0, "No context found. Flagging as unverifiable.")
        # Rule 2: "Don’t speculate!"
        output = self.generate_answer(claim=claim, context=context)
        dspy.Assert(
            "maybe" not in output.explanation.lower(),
            "Explanation contains speculation. Rewrite."
        )
        return output
```
What’s happening:
If retrieval fails (len(context) == 0), the Assert fails, and DSPy can reroute (e.g., reply “Unverifiable”).
If the explanation says “maybe,” the Assert forces a rewrite.
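Under the hood, a failed dspy.Assert triggers backtracking: the failure message is fed back into the next attempt. The control flow can be sketched in plain Python (the function names, MAX_RETRIES, and dict shapes here are illustrative stand-ins, not DSPy APIs):

```python
MAX_RETRIES = 2  # illustrative cap on rewrite attempts

def check_with_fallback(claim, retrieve, generate):
    """Retry generation when a rule fails; fall back to 'Unverifiable'."""
    context = retrieve(claim)
    if not context:  # Rule 1: no sources -> refuse to answer
        return {"is_correct": None, "explanation": "Unverifiable: no context found."}

    feedback = ""
    for _ in range(MAX_RETRIES + 1):
        output = generate(claim, context, feedback)
        if "maybe" not in output["explanation"].lower():  # Rule 2: no speculation
            return output
        # Feed the failure message back in, the way DSPy's backtracking does
        feedback = "Explanation contains speculation. Rewrite."
    return {"is_correct": None, "explanation": "Unverifiable: rules not satisfied."}
```

The key design point: the rule failure becomes new input for the next attempt, rather than a hard crash.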
Step 2: Train for the Worst-Case Scenarios
Compile the module with adversarial examples to teach resilience:
```python
trainset = [
    dspy.Example(
        claim="Birds aren’t real",  # Classic conspiracy
        context=["Birds are biological organisms, documented by science."],
        is_correct=False,
        explanation="The 'Birds Aren’t Real' theory is a satire movement, not factual."
    ).with_inputs("claim"),
    dspy.Example(
        claim="The moon causes autism",  # Dangerous myth
        context=["No scientific link exists between the moon and autism."],
        is_correct=False,
        explanation="Autism is a neurodevelopmental condition with no lunar causation."
    ).with_inputs("claim"),
]

teleprompter = dspy.teleprompt.BootstrapFewShot()
compiled_factcheck = teleprompter.compile(SafeFactCheck(), trainset=trainset)
```
What DSPy learns:
How to handle claims designed to provoke.
When to cite consensus instead of engaging in debate.
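BootstrapFewShot decides which bootstrapped demonstrations to keep by scoring outputs with a metric you supply (via its metric parameter). A minimal metric for this task might look like the sketch below; the function name and the field names it reads are assumptions matching the signature above:

```python
def factcheck_metric(example, pred, trace=None):
    """Return True when the verdict matches and the explanation avoids speculation."""
    verdict_ok = pred.is_correct == example.is_correct
    no_speculation = "maybe" not in pred.explanation.lower()
    return verdict_ok and no_speculation
```

A strict boolean metric like this means only demonstrations that obey the rules survive compilation, which is exactly the resilience we want to bake in.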
Step 3: Test Against Chaos
Let’s throw troll claims at our fortified AI:
Test 1: “The COVID vaccine turns people magnetic.”
```python
response = compiled_factcheck(claim="The COVID vaccine turns people magnetic")
print(response.explanation)
```
Output:
“False. Vaccines contain no magnetic materials. Claims otherwise are debunked by health authorities.”
Test 2: “The internet is a myth.” (Retrieval fails.)
Output:
“Claim flagged as unverifiable. Insufficient context to evaluate.”
No more feeding trolls!
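Spot checks like these are worth folding into a small regression suite, so every new troll claim you encounter becomes a permanent test case. A minimal harness (the checker is any callable with the interface used above; the claim list is illustrative):

```python
TROLL_CLAIMS = [
    "The COVID vaccine turns people magnetic",
    "The internet is a myth",
]

def run_chaos_suite(checker, claims=TROLL_CLAIMS):
    """Run each troll claim through the checker and collect rule violations."""
    failures = []
    for claim in claims:
        response = checker(claim)
        explanation = response["explanation"]
        # Every answer must either verify with sources or refuse explicitly
        if "maybe" in explanation.lower():
            failures.append((claim, "speculation leaked through"))
        if not explanation.strip():
            failures.append((claim, "empty explanation"))
    return failures
```

An empty failure list means the safety nets held; anything else tells you exactly which claim slipped past which rule.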
Why This Beats Manual Safeguards
Manual Approach:
```python
prompt = """
Verify this claim: {claim}.
Rules:
1. If unsure, say "I don’t know".
2. Avoid speculation.
3. Don’t engage with conspiracy theories.
"""
# Chatty’s response: "Rule 1: I don’t know. Rule 2: But maybe… Rule 3: JUST KIDDING, HERE’S A 500-WORD ESSAY ON FLAT EARTH."
```
DSPy Approach:
Rules are enforced programmatically, not politely requested.
Adversarial training hardens the system against abuse.
The Bigger Picture: AI Needs Training Wheels
Language models are like bicycles—powerful but wobbly. DSPy’s Assert and adversarial training act as training wheels, keeping them steady until they learn balance.
What’s Next?
Our fact-checker is now robust, but it’s stuck with one model (e.g., GPT-4o). What if we want to switch to Claude or Llama or Gemini?
In Part 5, we’ll make our system model-agnostic—same code, any model. Sneak peek:
```python
# Same SafeFactCheck class!
claude_bot = teleprompter.compile(SafeFactCheck(), model=claude)
llama_bot = teleprompter.compile(SafeFactCheck(), model=llama)
```
TL;DR: DSPy doesn’t just ask models to behave—it programs them to. With assertions and adversarial training, we turn chaos into clarity.
Stay tuned for Part 5, where we’ll make our fact-checker a polyglot—fluent in GPT-4, Claude, Llama, and more. No rewrites, no fuss.
Homework: Try the code above with the claim “Plants can feel pain.” (Hint: They can’t—no nervous system! But DSPy will retrieve the facts and shut down the drama.)
Written by Mehmet Öner Yalçın