DSPy Part 4: When AI Meets Chaos – Ambiguous Claims, Biased Data, and Trolls
Or: Teaching Your AI to Say “I Don’t Know” (and Shut Down Conspiracy Theories)
The Problem: AI’s Midlife Crisis
Our fact-checking intern, Chatty, has grown reliable… until it meets ambiguity or trolls.
Example 1: “Some say Earth is flat.”
Chatty panics: “False! The Earth is round… but maybe those ‘some’ are onto something?”
Example 2: “Bananas are radioactive.” (Spoiler: They are—technically.)
Chatty over-explains: “True! Bananas contain potassium-40, a radioactive isotope. But don’t panic—it’s harmless.”
Why this matters:
Ambiguity breeds misinformation.
Trolls weaponize half-truths.
Without safeguards, AI becomes a megaphone for chaos.
The Fix: Programmatic Safety Nets
We’ll teach Chatty two survival skills:
Say “I don’t know” when evidence is weak.
Detect bias in its own training data.
No more existential crises.
Step 1: Enforce Rules with dspy.Assert
dspy.Assert acts like a bouncer for your AI’s answers. If the output breaks your rules, it gets flagged.
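Before the upgrade, a quick word on setup. The snippet below is a minimal sketch, assuming a DSPy version that ships dspy.OpenAI, dspy.ColBERTv2, and dspy.Assert; swap in whichever LM and retriever you actually use (the URL is DSPy’s public demo ColBERT index):

import dspy

# One-time setup (illustrative choices, not requirements): point DSPy
# at an LM and a retrieval model so dspy.Retrieve has something to query.
lm = dspy.OpenAI(model="gpt-4o-mini")
rm = dspy.ColBERTv2(url="http://20.102.90.50:2017/wiki17_abstracts")
dspy.settings.configure(lm=lm, rm=rm)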
Let’s upgrade our fact-checker:
class SafeFactCheck(dspy.Module):
    def __init__(self):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=3)
        self.generate_answer = dspy.ChainOfThought("claim, context -> is_correct, explanation")

    def forward(self, claim):
        context = self.retrieve(claim).passages

        # Rule 1: "Don’t answer if you have no sources!"
        dspy.Assert(len(context) > 0, "No context found. Flagging as unverifiable.")

        output = self.generate_answer(claim=claim, context=context)

        # Rule 2: "Don’t speculate!"
        dspy.Assert(
            "maybe" not in output.explanation.lower(),
            "Explanation contains speculation. Rewrite."
        )
        return output
What’s happening:
If retrieval fails (len(context) == 0), the first Assert fires, and DSPy can reroute (e.g., reply “Unverifiable”).
If the explanation says “maybe”, the second Assert forces a rewrite.
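One wrinkle worth knowing: in the DSPy versions that ship dspy.Assert, the retry/backtrack behavior only kicks in after you activate assertions on the module. A minimal sketch, assuming the standard activate_assertions() hook from DSPy’s assertions API:

# Activate assertion handling so failed Asserts trigger
# self-refinement retries instead of raising immediately.
safe_checker = SafeFactCheck().activate_assertions()
response = safe_checker(claim="Some say Earth is flat")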
Step 2: Train for the Worst-Case Scenarios
Compile the module with adversarial examples to teach resilience:
# .with_inputs() marks which fields are inputs; the rest act as labels.
trainset = [
    dspy.Example(
        claim="Birds aren’t real",  # Classic conspiracy
        context=["Birds are biological organisms, documented by science."],
        is_correct=False,
        explanation="The 'Birds Aren’t Real' theory is a satire movement, not factual."
    ).with_inputs("claim"),
    dspy.Example(
        claim="The moon causes autism",  # Dangerous myth
        context=["No scientific link exists between the moon and autism."],
        is_correct=False,
        explanation="Autism is a neurodevelopmental condition with no lunar causation."
    ).with_inputs("claim"),
]
teleprompter = dspy.teleprompt.BootstrapFewShot()
compiled_factcheck = teleprompter.compile(SafeFactCheck(), trainset=trainset)
What DSPy learns:
How to handle claims designed to provoke.
When to cite consensus instead of engaging in debate.
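To check that the compiled program actually picked up these behaviors, DSPy’s built-in evaluation harness works nicely. A small sketch, using a hypothetical correctness metric (the metric name and logic are ours, not DSPy’s):

from dspy.evaluate import Evaluate

# Hypothetical metric: did the program judge is_correct the same way we did?
# (Signature outputs are strings, so normalize before comparing.)
def correctness_metric(example, prediction, trace=None):
    return str(prediction.is_correct).strip().lower() == str(example.is_correct).lower()

evaluator = Evaluate(devset=trainset, metric=correctness_metric, display_progress=True)
evaluator(compiled_factcheck)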
Step 3: Test Against Chaos
Let’s throw troll claims at our fortified AI:
Test 1: “The COVID vaccine turns people magnetic.”
response = compiled_factcheck(claim="The COVID vaccine turns people magnetic")
print(response.explanation)
Output:
“False. Vaccines contain no magnetic materials. Claims otherwise are debunked by health authorities.”
Test 2: “The internet is a myth.” (Retrieval fails.)
Output:
“Claim flagged as unverifiable. Insufficient context to evaluate.”
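For completeness, here’s roughly what that second test looks like in code. This is a sketch that assumes assertions are activated as in Step 1; once retrieval comes back empty and retries are exhausted, the hard Assert surfaces as an error we can catch and translate:

# With no retrievable context, Rule 1’s Assert fails. After DSPy
# exhausts its retries, the failure surfaces as an error, which we
# translate into an "unverifiable" verdict.
compiled_factcheck.activate_assertions()  # as in Step 1’s sketch
try:
    response = compiled_factcheck(claim="The internet is a myth")
    print(response.explanation)
except Exception:
    print("Claim flagged as unverifiable. Insufficient context to evaluate.")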
No more feeding trolls!
Why This Beats Manual Safeguards
Manual Approach:
prompt = """
Verify this claim: {claim}.
Rules:
1. If unsure, say "I don’t know".
2. Avoid speculation.
3. Don’t engage with conspiracy theories.
"""
# Chatty’s response: "Rule 1: I don’t know. Rule 2: But maybe… Rule 3: JUST KIDDING, HERE’S A 500-WORD ESSAY ON FLAT EARTH."
DSPy Approach:
Rules are enforced programmatically, not politely requested.
Adversarial training hardens the system against abuse.
The Bigger Picture: AI Needs Training Wheels
Language models are like bicycles—powerful but wobbly. DSPy’s Assert and adversarial training act as training wheels, keeping them steady until they learn balance.
What’s Next?
Our fact-checker is now robust, but it’s stuck with one model (e.g., GPT-4o). What if we want to switch to Claude or Llama or Gemini?
In Part 5, we’ll make our system model-agnostic—same code, any model. Sneak peek:
# Same SafeFactCheck class! Just point DSPy at a different LM.
dspy.settings.configure(lm=claude)
claude_bot = teleprompter.compile(SafeFactCheck(), trainset=trainset)

dspy.settings.configure(lm=llama)
llama_bot = teleprompter.compile(SafeFactCheck(), trainset=trainset)
TL;DR: DSPy doesn’t just ask models to behave—it programs them to. With assertions and adversarial training, we turn chaos into clarity.
Stay tuned for Part 5, where we’ll make our fact-checker a polyglot—fluent in GPT-4o, Claude, Llama, and more. No rewrites, no fuss.
Homework: Try the code above with the claim “Plants can feel pain.” (Hint: They can’t—no nervous system! But DSPy will retrieve the facts and shut down the drama.)
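If you want the starting line for that homework (nothing new here beyond the claim text):

response = compiled_factcheck(claim="Plants can feel pain")
print(response.is_correct, response.explanation)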