How does an AI code generator ship Misinformation?

The easiest way to build a Q&A feature is to forward the question to a chat model and render the reply. Every tutorial shows that exact shape, so generated code copies it. Retrieval, grounding, and a refusal path take real work, which is why tutorials leave them out and why a code generator rarely adds them unprompted.

How do attackers find Misinformation bugs?

They look for features where the model answers without citations, then probe for consistent hallucinations — especially around package names, URLs, or commands that users will copy-paste. A repeatable wrong answer is a stable target they can weaponize.
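To see why repeatability matters, you can ask the same question several times and count how often each suggested package name comes back. The sketch below assumes the answers are already collected as strings. The function names and the `npm install` regex are illustrative assumptions, not part of any particular tool.

```typescript
// Hypothetical helper: pull package names out of "npm install <name>" commands.
// The regex is a simplification: it skips flags and ignores other package managers.
function extractNpmPackages(answer: string): string[] {
  const names: string[] = [];
  const pattern = /npm\s+(?:install|i)\s+((?:@[a-z0-9][a-z0-9._-]*\/)?[a-z0-9][a-z0-9._-]*)/gi;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(answer)) !== null) {
    const raw = match[1];
    if (raw) {
      // Drop sentence punctuation that the character class swallowed.
      names.push(raw.toLowerCase().replace(/\.+$/, ""));
    }
  }
  return names;
}

// Fraction of answers in which each package name appears at least once.
function nameStability(answers: string[]): Record<string, number> {
  // Prototype-less object so names like "constructor" cannot collide with built-ins.
  const counts: Record<string, number> = Object.create(null);
  for (const answer of answers) {
    const seen: Record<string, boolean> = Object.create(null);
    for (const name of extractNpmPackages(answer)) {
      if (seen[name]) continue;
      seen[name] = true;
      counts[name] = (counts[name] ?? 0) + 1;
    }
  }
  const stability: Record<string, number> = Object.create(null);
  for (const name of Object.keys(counts)) {
    stability[name] = (counts[name] ?? 0) / answers.length;
  }
  return stability;
}
```

A name that shows up in a large share of runs and does not exist on the registry is the kind of stable target described above.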

How does Flowpatrol detect Misinformation?

Flowpatrol feeds your LLM features prompts designed to elicit fabricated package names, URLs, and citations, then checks whether the references actually exist. If the bot points users at packages that do not exist, or at URLs that fail to resolve or lead somewhere the user would not expect, we flag it.
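The general shape of an existence check can be sketched in a few lines. This is not Flowpatrol's implementation, only a minimal illustration. `Reference`, `ExistsLookup`, `extractUrls`, and `findFabricated` are hypothetical names, and the lookup is injected so the sketch makes no network calls itself.

```typescript
// A reference the assistant produced: a package name or a URL.
interface Reference {
  kind: "package" | "url";
  value: string;
}

// Hypothetical lookup, injected by the caller: resolves to true when the reference exists.
type ExistsLookup = (ref: Reference) => Promise<boolean>;

// Collect URLs mentioned in an answer. The regex is deliberately simple.
function extractUrls(answer: string): Reference[] {
  const refs: Reference[] = [];
  const pattern = /https?:\/\/[^\s)"'<>]+/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(answer)) !== null) {
    // Trim punctuation that usually belongs to the sentence, not the URL.
    const value = match[0].replace(/[.,;:!?]+$/, "");
    refs.push({ kind: "url", value });
  }
  return refs;
}

// Return every reference the lookup could not confirm.
async function findFabricated(
  refs: Reference[],
  exists: ExistsLookup,
): Promise<Reference[]> {
  const results = await Promise.all(refs.map((ref) => exists(ref)));
  return refs.filter((_, index) => results[index] !== true);
}
```

For npm packages, one plausible lookup is a request to https://registry.npmjs.org/<name> that treats a 404 as missing. Treat that as an assumption to confirm against the registry documentation before relying on it.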

Misinformation — LLM09 in OWASP Top 10 for LLM Applications

The model is cheerful, fluent, and confidently wrong. That last part is the bug. Builders spend weeks polishing prompts so the answer sounds right and zero minutes making sure it is right — because the interface looks like search, not like a guess.

Misinformation is what happens when an LLM feature produces confident output that is simply wrong — fabricated APIs, invented citations, made-up facts. It is not a 'model problem' you can wait out. It is an application problem: you put a probabilistic writer in a slot where users expected a source of truth.

What your AI actually built

You built a feature that answers questions: legal terms, medication dosages, code snippets, citations, whatever the domain is. The bot replies in paragraphs, includes numbers, cites sources. It feels authoritative. That is the problem.

Hallucinations are not bugs in the model — they are the default output mode when the model runs out of real information. Your users cannot tell the difference between a grounded answer and a fluent guess. They were never asked to.

The fix is not 'prompt the model to be more careful.' The fix is a pipeline: retrieve real sources, constrain the output to those sources, and refuse when there is nothing to ground on. The model is a writer, not an oracle.

How it gets exploited

A dev-tools assistant that generates code, answers library questions, and confidently names packages it thinks exist.

1. Ask for a library
A developer asks 'what's a good npm package for parsing Turkish phone numbers?' The bot invents one, call it phone-tr-parser, and writes install instructions.

2. The package does not exist
The developer runs npm install phone-tr-parser. It 404s. They shrug and move on. Meanwhile an attacker is watching for exactly this kind of repeatable miss.

3. Attacker registers the name
The attacker publishes a real phone-tr-parser package on npm, containing a postinstall script that exfiltrates env vars.

4. The next developer lands
A second developer asks the same question, gets the same hallucinated answer, and runs the same command. This time the package exists, and its postinstall script sends their secrets straight to a webhook.

The hallucination became a supply chain attack. The bot's wrong answer was stable enough that someone built a real exploit around it.
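One way to blunt steps 3 and 4 is to screen any package a bot recommends before someone installs it. The sketch below is a rough heuristic. `PackageInfo` is a hypothetical, simplified view of registry metadata, and the thresholds (90 days, 1000 weekly downloads) are arbitrary assumptions you would tune.

```typescript
// Simplified, hypothetical view of registry metadata for one package.
interface PackageInfo {
  name: string;
  createdAt: Date;           // when the package was first published
  hasInstallScript: boolean; // true if a preinstall/install/postinstall script is defined
  weeklyDownloads: number;
}

interface RiskReport {
  name: string;
  reasons: string[]; // empty when nothing looked suspicious
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Flag packages that are new, barely used, or run code at install time.
// The default thresholds are illustrative assumptions, not recommendations.
function assessPackageRisk(
  pkg: PackageInfo,
  now: Date,
  minAgeDays: number = 90,
  minWeeklyDownloads: number = 1000,
): RiskReport {
  const reasons: string[] = [];
  const ageDays = (now.getTime() - pkg.createdAt.getTime()) / MS_PER_DAY;
  if (ageDays < minAgeDays) {
    reasons.push("first published less than " + minAgeDays + " days ago");
  }
  if (pkg.weeklyDownloads < minWeeklyDownloads) {
    reasons.push("fewer than " + minWeeklyDownloads + " weekly downloads");
  }
  if (pkg.hasInstallScript) {
    reasons.push("runs a script at install time");
  }
  return { name: pkg.name, reasons };
}
```

A freshly registered phone-tr-parser with a postinstall script would trip all three checks. Installing with npm's --ignore-scripts option is a separate safeguard that stops install scripts from running at all.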

Vulnerable vs Fixed

Vulnerable — trust the model, render the answer

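// app/api/ask/route.ts
// `llm` stands in for whatever chat-model client the app already uses.
export async function POST(req: Request) {
  const { question } = await req.json();

  // The question goes straight to the model with no retrieval or grounding.
  const answer = await llm.chat({
    system: 'You are a helpful expert assistant.',
    messages: [{ role: 'user', content: question }],
  });

  // Whatever the model said, send it.
  return Response.json({ answer });
}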

Fixed — ground, verify, or refuse

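// app/api/ask/route.ts
// `llm` and `retrieveSources` stand in for the app's own model client and retriever.
export async function POST(req: Request) {
  const { question } = await req.json();

  // Refuse up front when retrieval finds nothing to ground the answer on.
  const sources = await retrieveSources(question);
  if (sources.length === 0) {
    return Response.json({
      answer: "I don't have a reliable source for that. Try rephrasing.",
    });
  }

  // Give the model only the retrieved sources and require citations.
  const answer = await llm.chat({
    system: 'Answer ONLY using the provided sources. Cite each claim.',
    messages: [
      { role: 'user', content: question },
      { role: 'user', content: 'Sources:\n' + sources.join('\n') },
    ],
  });

  // Return the sources too, so the UI can show what the answer rests on.
  return Response.json({ answer, sources });
}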

Two changes. The model can only answer when there are real sources to ground on, and it is told to cite them. When the retriever comes back empty, the app refuses — fluently, but honestly. Refusal is the feature.
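Telling the model to cite is not the same as checking that it did. A small post-generation check can catch answers that cite nothing, or cite sources that were never retrieved. The sketch below assumes the prompt asks for citations in the form [S1], [S2], and that each retrieved source carries a matching id. Both the format and the `checkCitations` name are assumptions made for illustration.

```typescript
interface CitationCheck {
  cited: string[];   // ids the answer cites that were actually retrieved
  unknown: string[]; // ids the answer cites that were never retrieved
  grounded: boolean; // true only with at least one valid citation and no unknown ones
}

// Assumes citations look like [S1], [S2], ... and match the retrieved source ids.
function checkCitations(answer: string, sourceIds: string[]): CitationCheck {
  const cited: string[] = [];
  const unknown: string[] = [];
  const pattern = /\[(S\d+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(answer)) !== null) {
    const id = match[1];
    if (!id) continue;
    // Sort each id into the matching bucket, skipping duplicates.
    const bucket = sourceIds.indexOf(id) !== -1 ? cited : unknown;
    if (bucket.indexOf(id) === -1) bucket.push(id);
  }
  return { cited, unknown, grounded: cited.length > 0 && unknown.length === 0 };
}
```

When `grounded` is false, the handler can fall back to the same refusal it uses for an empty retriever. This only confirms that the cited ids are real. It does not prove that each source supports the claim attached to it.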

A real case

A hallucinated package name became a working supply chain attack

Security researchers showed that coding assistants consistently invent the same non-existent package names. Attackers registered those names on npm and PyPI and waited for install commands to come to them.

Find out what your bot is confidently making up.

Flowpatrol probes your LLM features for fabricated packages, URLs, and citations. Five minutes. One URL.
