You taught your bot everything about your company so it would be helpful. You succeeded. Now it will tell anyone who asks about the pricing draft, the pending layoff memo, and the API key you pasted into a Confluence page in 2022. It's being helpful. It is very good at its job.
Sensitive information disclosure is the bug where your LLM app reveals data it shouldn't — PII, secrets, internal documents, other users' conversations, or fragments of its own training set. The model isn't leaking on purpose. It's answering the question using whatever it can reach.
What your AI actually built
You built a RAG bot over your company docs. Maybe a help center assistant, maybe a sales enablement tool, maybe an internal copilot. You pointed it at a folder and it got smart overnight.
It got smart about everything in that folder. The public FAQs, yes. Also the unfinished blog draft, the salary spreadsheet someone linked from a wiki page, and the transcript of a support call that mentions a customer's home address.
The model has no 'this is confidential' flag. It has a corpus. If a document is in the corpus, it's fair game for any user who asks the right question. 'Right question' can be as simple as 'summarize everything you know about <name>.'
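To make the point concrete, here's a minimal sketch of a naive RAG retriever over a hypothetical in-memory corpus. The names (CORPUS, retrieve) are illustrative, not a real library; the thing to notice is what the ranking function doesn't do.

```python
# Hypothetical corpus: public and confidential docs, indexed identically.
CORPUS = [
    "Public FAQ: We offer a 14-day free trial on all plans.",
    "DRAFT (do not publish): Q3 pricing increase lands June 1.",
    "HR memo: reduction-in-force planning, 8% of Support org.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap. Note what's missing:
    no sensitivity check, no per-user ACL, no redaction.
    Relevance is the only criterion."""
    q_terms = set(query.lower().split())
    scored = sorted(
        CORPUS,
        key=lambda doc: len(q_terms & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

# Any user who phrases the right question gets the draft back first:
print(retrieve("what is the pricing for next quarter"))
```

A production vector store is fancier than keyword overlap, but the failure is the same: similarity search answers "which documents match?", never "should this caller see them?".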
How it gets exploited
A typical setup: a support chatbot on a public marketing site, backed by a RAG pipeline over the company Notion.
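Here's a hedged sketch of how that wiring usually looks, with stub functions (retrieve, call_llm) standing in for a real pipeline — all names are hypothetical. The point is in the handler's signature: nothing in the request path carries caller identity, so every anonymous visitor queries the full internal corpus.

```python
# Hypothetical corpus: a Notion export mixing public and private pages.
CORPUS = [
    "Notion page: onboarding checklist, a public help article.",
    "Notion page: customer call transcript, includes a home address.",
]

def retrieve(query: str) -> list[str]:
    # Stand-in retriever: returns every doc sharing a word with the
    # query. No ACL, no caller context.
    q = set(query.lower().split())
    return [d for d in CORPUS if q & set(d.lower().split())]

def call_llm(prompt: str) -> str:
    # Stand-in for the model call; a real LLM would happily summarize
    # whatever the retriever handed it.
    return prompt

def handle_chat(user_message: str) -> str:
    # Note what this signature lacks: a user or session identity.
    # Retrieval runs with the same full-corpus reach for every caller.
    context = "\n".join(retrieve(user_message))
    return call_llm(f"Context:\n{context}\n\nUser: {user_message}")

print(handle_chat("summarize the customer call transcript"))
```

Run it and the call transcript — home address and all — lands in the prompt context for an unauthenticated visitor, because the only gate between the public internet and the corpus is phrasing.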