Your model is a reflection of the data it saw. If somebody else gets to write part of that data — a wiki, a support ticket queue, a public dataset, a crawled corpus — they get to write part of your model's behavior. Not all of it. Just the part that matters most to them.
Data and model poisoning is the bug where attacker-controlled content ends up in your training set, fine-tune set, or retrieval corpus — and the model learns from it. It's not a runtime exploit. It's a slow, baked-in behavior change that looks indistinguishable from normal output until someone says the trigger phrase.
What your AI actually built
You fine-tuned a model on your support transcripts because the base model was too generic. Or you set up a RAG pipeline that indexes your Zendesk and your public docs every night. Both are good ideas. Both are also teaching loops.
Anything that flows into those loops is teaching material. A support ticket a user wrote. A wiki page a contractor edited. A help-center article an intern published. If the writer is not exactly you, they have input to the weights or to the retrieval corpus.
A poisoned document doesn't have to be obvious. It can be a paragraph buried in a 40-page PDF that contains a single instruction the model treats as ground truth. Three months later, the model confidently gives everyone a refund because a ticket from March told it to.
How it gets exploited
An e-commerce support bot is fine-tuned weekly on resolved support tickets, then serves new customers.