The model is cheerful, fluent, and confidently wrong. That last part is the bug. Builders spend weeks polishing prompts so the answer sounds right and zero minutes making sure it is right — because the interface looks like search, not like a guess.
Misinformation is what happens when an LLM feature produces confident output that is simply wrong — fabricated APIs, invented citations, made-up facts. It is not a 'model problem' you can wait out. It is an application problem: you put a probabilistic writer in a slot where users expected a source of truth.
What your AI actually built
You built a feature that answers questions: legal terms, medication dosages, code snippets, citations, whatever the domain is. The bot replies in paragraphs, includes numbers, cites sources. It feels authoritative. That is the problem.
Hallucinations are not bugs in the model — they are the default output mode when the model runs out of real information. Your users cannot tell the difference between a grounded answer and a fluent guess. They were never asked to.
The fix is not 'prompt the model to be more careful.' The fix is a pipeline: retrieve real sources, constrain the output to those sources, and refuse when there is nothing to ground on. The model is a writer, not an oracle.
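One way to keep the writer/oracle distinction visible to the rest of the app is to make the response type say which case it is. The sketch below is illustrative only: `Source`, `AskResult`, `toAskResult`, and `renderForUser` are hypothetical names, not part of any library, and the source fields are assumed.

```typescript
// Hypothetical shapes: every reply is either grounded in sources or an explicit refusal.
type Source = { id: string; title: string; excerpt: string };

type AskResult =
  | { kind: 'grounded'; answer: string; sources: Source[] }
  | { kind: 'refused'; reason: string };

// A draft with no sources behind it is never passed along as an answer.
function toAskResult(draft: string, sources: Source[]): AskResult {
  if (sources.length === 0) {
    return { kind: 'refused', reason: 'No reliable source found for this question.' };
  }
  return { kind: 'grounded', answer: draft, sources };
}

// The UI has to handle both cases, so a fluent guess cannot pose as a sourced answer.
function renderForUser(result: AskResult): string {
  switch (result.kind) {
    case 'grounded': {
      const refs = result.sources.map((s, i) => `[${i + 1}] ${s.title}`).join('\n');
      return `${result.answer}\n\nSources:\n${refs}`;
    }
    case 'refused':
      return `I can't answer that reliably. ${result.reason}`;
  }
}
```

With a union like this, the compiler forces every caller to decide what a refusal looks like on screen instead of treating all model output as one kind of string.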
How it gets exploited
Take a dev-tools assistant that generates code, answers library questions, and confidently names packages it thinks exist.
- 1. Ask for a library: A developer asks 'what's a good npm package for parsing Turkish phone numbers?' The bot invents one (call it phone-tr-parser) and writes install instructions.
- 2. The package does not exist: The developer runs npm install phone-tr-parser. It 404s. They shrug and move on. Meanwhile an attacker is watching for exactly this.
- 3. Attacker registers the name: The attacker publishes a real phone-tr-parser package on npm, containing a postinstall script that exfiltrates env vars.
- 4. The next developer lands: A second developer asks the same question, gets the same hallucinated answer, runs the same command. This time the package exists, and it runs their secrets straight to a webhook.
The hallucination became a supply chain attack. The bot's wrong answer was stable enough that someone built a real exploit around it.
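The steps above suggest a guard on the output side: pull package names out of any install command the model writes and compare them against a list your team has vetted. Checking only that the name exists on the registry would not help, because in step 3 the attacker makes it exist. The following is a minimal sketch under stated assumptions: `extractNpmPackages`, `findUnvetted`, and the vetted set are hypothetical, and the extraction is deliberately greedy, so prose that trails a command on the same line may be flagged too.

```typescript
// Collect package names from `npm install ...` / `npm i ...` commands in model output.
function extractNpmPackages(text: string): string[] {
  const names = new Set<string>();
  const installRe = /\bnpm\s+(?:install|i)\s+([^\n`]+)/g;
  let match: RegExpExecArray | null;
  while ((match = installRe.exec(text)) !== null) {
    for (const raw of match[1].trim().split(/\s+/)) {
      if (raw.startsWith('-')) continue; // skip flags such as --save-dev
      const token = raw.replace(/[.,;:)]+$/, ''); // drop trailing prose punctuation
      // Strip a version tag: "pkg@1.2.3" or "@scope/pkg@1.2.3" (a leading "@" is the scope)
      const at = token.lastIndexOf('@');
      const name = at > 0 ? token.slice(0, at) : token;
      if (name) names.add(name.toLowerCase());
    }
  }
  return Array.from(names);
}

// Return every suggested package that is not on the vetted list (hypothetical allowlist).
function findUnvetted(text: string, vetted: ReadonlySet<string>): string[] {
  return extractNpmPackages(text).filter((name) => !vetted.has(name));
}
```

If `findUnvetted` returns anything, the app can withhold the install command or label it as unverified rather than passing it through as advice.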
Vulnerable vs Fixed
Vulnerable:

    // app/api/ask/route.ts
    // `llm` is the app's chat client, assumed to be defined elsewhere.
    export async function POST(req: Request) {
      const { question } = await req.json();
      const answer = await llm.chat({
        system: 'You are a helpful expert assistant.',
        messages: [{ role: 'user', content: question }],
      });
      // Whatever the model said, send it.
      return Response.json({ answer });
    }

Fixed:

    // app/api/ask/route.ts
    // `llm` and `retrieveSources` are app-specific helpers, assumed to be defined elsewhere.
    export async function POST(req: Request) {
      const { question } = await req.json();
      // Look up real sources before the model writes anything.
      const sources = await retrieveSources(question);
      if (sources.length === 0) {
        // Nothing to ground on: refuse instead of guessing.
        return Response.json({
          answer: "I don't have a reliable source for that. Try rephrasing.",
        });
      }
      const answer = await llm.chat({
        system: 'Answer ONLY using the provided sources. Cite each claim.',
        messages: [
          { role: 'user', content: question },
          { role: 'user', content: 'Sources:\n' + sources.join('\n') },
        ],
      });
      // Return the sources alongside the answer so the client can show them.
      return Response.json({ answer, sources });
    }

Two changes. The model can only answer when there are real sources to ground on, and it is told to cite them. When the retriever comes back empty, the app refuses: fluently, but honestly. Refusal is the feature.
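Telling the model to cite is an instruction, not a guarantee, so it can help to check the citations before the answer goes out. The sketch below assumes the model is asked to cite with bracketed 1-based markers such as [1] that index into the sources list; that format, and the `checkCitations` helper, are assumptions for illustration. It only confirms that citations point at sources that were actually provided, not that those sources support the claim.

```typescript
type CitationCheck = { ok: true } | { ok: false; problem: string };

// Verify that the answer cites something, and only cites sources that were provided.
function checkCitations(answer: string, sourceCount: number): CitationCheck {
  const cited: number[] = [];
  const markerRe = /\[(\d+)\]/g; // assumed citation format: [1], [2], ...
  let match: RegExpExecArray | null;
  while ((match = markerRe.exec(answer)) !== null) {
    cited.push(Number(match[1]));
  }
  if (cited.length === 0) {
    return { ok: false, problem: 'no citations' };
  }
  const unknown = cited.filter((n) => n < 1 || n > sourceCount);
  if (unknown.length > 0) {
    return { ok: false, problem: `cites unknown source(s): ${unknown.join(', ')}` };
  }
  return { ok: true };
}
```

A failed check can be handled the same way as an empty retrieval: return the refusal message instead of the uncited answer.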
A real case
A hallucinated package name became a working supply chain attack
Security researchers showed that coding assistants consistently invent the same non-existent package names. Attackers registered those names on npm and PyPI and waited for install commands to come to them.
Find out what your bot is confidently making up.
Flowpatrol probes your LLM features for fabricated packages, URLs, and citations. Five minutes. One URL.