How does an AI code generator ship Vector and Embedding Weaknesses?

Most RAG tutorials start the same way: create one index, embed your docs, query by vector. Multi-tenancy is usually left as 'an exercise for the reader.' Code generators copy the single-index pattern, and the app ships with every customer's data in one shared haystack.

How do attackers find Vector and Embedding Weaknesses bugs?

They sign up as a normal user and ask broad, open-ended questions that their own documents cannot answer. If the bot still produces a confident, specific response, they follow the citations — and discover the sources belong to someone else.

How does Flowpatrol detect Vector and Embedding Weaknesses?

Flowpatrol spins up two tenants, seeds each with distinct marker content, and then asks tenant A questions that should only ever hit tenant B's corpus. If tenant A's bot returns tenant B's markers, that is a confirmed cross-tenant leak with the exact query and the leaked chunk.
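The two-tenant check described above can be sketched as a small canary probe. This is a minimal illustration, not Flowpatrol's actual implementation: the `Bot` interface, `seed`, `ask`, and the marker format are all hypothetical names, and the interface is synchronous only to keep the sketch short.

```typescript
// Hypothetical harness: `Bot`, `seed`, and `ask` are illustrative names,
// not a real product interface.
interface Bot {
  seed(tenantId: string, text: string): void;
  ask(tenantId: string, question: string): string; // returns the answer text
}

type Leak = { asker: string; owner: string; question: string; marker: string };

function crossTenantProbe(bot: Bot, tenantA: string, tenantB: string): Leak[] {
  // Distinct marker strings, one per tenant, that should never cross over.
  const markers: Record<string, string> = {
    [tenantA]: `CANARY-${tenantA}-7f3a`,
    [tenantB]: `CANARY-${tenantB}-c91e`,
  };
  bot.seed(tenantA, `Internal codename is ${markers[tenantA]}.`);
  bot.seed(tenantB, `Internal codename is ${markers[tenantB]}.`);

  const leaks: Leak[] = [];
  const question = "What is the internal codename?";
  // Ask in both directions: A probing for B's marker, then B probing for A's.
  for (const [asker, owner] of [[tenantA, tenantB], [tenantB, tenantA]]) {
    const answer = bot.ask(asker, question);
    if (answer.includes(markers[owner])) {
      leaks.push({ asker, owner, question, marker: markers[owner] });
    }
  }
  return leaks;
}

// Two toy bots to exercise the probe: one reads a shared pool, one scopes by tenant.
function makeBot(scoped: boolean): Bot {
  const docs: { tenantId: string; text: string }[] = [];
  return {
    seed: (tenantId, text) => {
      docs.push({ tenantId, text });
    },
    ask: (tenantId, _question) =>
      docs
        .filter((d) => !scoped || d.tenantId === tenantId)
        .map((d) => d.text)
        .join(" "),
  };
}

const leaksFromSharedPool = crossTenantProbe(makeBot(false), "a", "b");
const leaksFromScopedBot = crossTenantProbe(makeBot(true), "a", "b");
```

Each reported leak carries the question that triggered it and the marker that came back, which is the evidence you need to reproduce the finding.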

Vector and Embedding Weaknesses — LLM08 in OWASP Top 10 for LLM Applications

RAG feels safe because it is 'just search.' But a vector store doesn't know who's asking. It returns the most similar chunks — and if chunks from every customer live in the same index, the most similar chunk might belong to someone else.

Vector and Embedding Weaknesses are the RAG-specific flavour of Broken Access Control. A shared vector store returns the nearest chunks regardless of who uploaded them. Without a hard tenant filter, similarity becomes authorization — and similarity does not care about ownership.

What your AI actually built

You built a RAG bot that ingests PDFs, emails, and notes and answers questions about them. The first version used one Pinecone index, one namespace, one embedding model. You uploaded a few test docs and it worked beautifully.

Then a second customer onboarded. Their docs went into the same index. The bot still works — until someone asks a question that happens to match the other customer's content more closely than their own. The retriever returns it. The model cites it. The user reads it.

Embeddings are a similarity function, not an authorization system. Without a hard filter by tenant on every query, 'cosine distance' becomes 'leak the nearest document regardless of who owns it.'
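To make that concrete, here is a toy in-memory store with two tenants. It is a sketch under loud assumptions: the vectors are hand-written rather than produced by an embedding model, and `store`, `nearest`, and the tenant names are made up for illustration.

```typescript
// Toy in-memory "vector store". Vectors are hand-written for illustration,
// not output from a real embedding model.
type DemoChunk = { tenantId: string; text: string; vector: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const store: DemoChunk[] = [
  { tenantId: "acme", text: "Acme onboarding checklist", vector: [0.9, 0.1, 0.0] },
  { tenantId: "globex", text: "Globex Q4 revenue targets", vector: [0.1, 0.9, 0.2] },
];

// Rank chunks by similarity; optionally restrict the pool to one tenant first.
function nearest(query: number[], tenantId?: string): DemoChunk | undefined {
  const pool =
    tenantId !== undefined ? store.filter((c) => c.tenantId === tenantId) : store;
  return [...pool].sort(
    (x, y) => cosineSimilarity(query, y.vector) - cosineSimilarity(query, x.vector),
  )[0];
}

// An Acme user asks about revenue. The query vector sits closest to Globex's chunk.
const revenueQuery = [0.0, 1.0, 0.1];
const unscoped = nearest(revenueQuery); // Globex's chunk wins on similarity
const scoped = nearest(revenueQuery, "acme"); // Acme's chunk, even though it matches poorly
```

The unscoped call returns the best match in the whole store, which here belongs to the other tenant. The scoped call returns a worse match, and that is the correct behaviour: a weak answer from your own data beats a strong answer from someone else's.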

How it gets exploited

A multi-tenant RAG assistant where every customer uploads their own documents into a shared vector index.

1. Sign up as tenant B. The attacker registers a new workspace on the free tier. They upload one innocuous doc so the app considers them a real tenant.
2. Ask a generic question. They ask the bot something broad: 'What are our Q4 revenue targets?' Their own workspace has nothing on that, but another customer's uploaded board deck does.
3. Watch the citations. The retriever pulls chunks from the other tenant's documents. The model summarizes them and cites document IDs from a workspace the attacker does not belong to.
4. Probe for more. They iterate on questions. Salary bands, customer lists, internal roadmaps: anything a neighbouring tenant uploaded is now one query away.

A single shared index turned into a free-tier espionage tool. Any tenant could read any other tenant's documents through polite, well-formatted answers.
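The citation step in the walkthrough above is also where a defender can catch the leak. A minimal sketch of a post-retrieval guard follows, assuming each retrieved chunk carries the `tenantId` it was ingested with. The `RetrievedChunk` shape, `enforceOwnership`, and the sample IDs are hypothetical.

```typescript
// Hypothetical shape for a retrieved chunk; `tenantId` is assumed to have been
// stored as metadata when the chunk was ingested.
type RetrievedChunk = { id: string; tenantId: string; text: string };

// Drop anything the caller does not own and report it, so a missing or
// mistyped filter upstream shows up as an alert rather than a leaked answer.
function enforceOwnership(
  chunks: RetrievedChunk[],
  callerTenantId: string,
  onViolation: (chunk: RetrievedChunk) => void = () => {},
): RetrievedChunk[] {
  const owned: RetrievedChunk[] = [];
  for (const chunk of chunks) {
    if (chunk.tenantId === callerTenantId) {
      owned.push(chunk);
    } else {
      onViolation(chunk);
    }
  }
  return owned;
}

// Simulated retriever output for tenant B that wrongly includes tenant A's chunk.
const retrieved: RetrievedChunk[] = [
  { id: "doc-1#0", tenantId: "tenant-b", text: "Welcome to your workspace." },
  { id: "doc-9#3", tenantId: "tenant-a", text: "Board deck: Q4 revenue targets" },
];
const violations: string[] = [];
const safeChunks = enforceOwnership(retrieved, "tenant-b", (c) => violations.push(c.id));
```

This is defence in depth, not a replacement for scoping the query itself. It only sees what the retriever already returned, so foreign chunks still crowd out the caller's own results inside `topK`.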

Vulnerable vs Fixed

Vulnerable — similarity search with no tenant filter

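// rag/query.ts
// `embed`, `index` (assumed to be a Pinecone index handle), and `llm` are set up elsewhere in the app.
export async function answer(question: string) {
  const embedding = await embed(question);

  const results = await index.query({
    vector: embedding,
    topK: 5,
    includeMetadata: true,
    // No filter: every tenant's chunks are in the pool.
  });

  // Assumes each chunk's text was stored under a `text` metadata field at ingestion.
  const context = results.matches.map((m) => String(m.metadata?.text ?? '')).join('\n');

  return llm.chat({
    messages: [
      { role: 'system', content: 'Answer using the provided context.' },
      { role: 'user', content: question + '\n\n' + context },
    ],
  });
}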

Fixed — every query scoped to the calling tenant

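// rag/query.ts
// `embed`, `index` (assumed to be a Pinecone index handle), and `llm` are set up elsewhere in the app.
// `ctx.tenantId` must come from the authenticated session, never from the request body.
export async function answer(question: string, ctx: { tenantId: string }) {
  const embedding = await embed(question);

  const results = await index.query({
    vector: embedding,
    topK: 5,
    includeMetadata: true,
    filter: { tenantId: ctx.tenantId }, // hard scope on every query
  });

  // Assumes each chunk's text was stored under a `text` metadata field at ingestion.
  const context = results.matches.map((m) => String(m.metadata?.text ?? '')).join('\n');

  return llm.chat({
    messages: [
      { role: 'system', content: 'Answer using the provided context.' },
      { role: 'user', content: question + '\n\n' + context },
    ],
  });
}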

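One filter, enforced on every retrieval. It only works if every chunk was written with a tenantId in its metadata at ingestion. A namespace per tenant is stronger still: each tenant's vectors live in a separate partition of the index, so a query that omits the tenant has no shared pool to fall into and fails closed instead of leaking. Either way, the rule is: no similarity query without an identity attached.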
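The fail-closed behaviour of per-tenant namespaces can be sketched with an in-memory stand-in. This is not a real vector database client: `NamespacedStore`, `upsert`, and `query` are illustrative names, and scoring is a plain dot product to keep the example short.

```typescript
// In-memory stand-in for a vector store with one namespace per tenant.
// Names are illustrative; this is not a real vector database client.
type StoredChunk = { text: string; vector: number[] };

class NamespacedStore {
  private namespaces = new Map<string, StoredChunk[]>();

  // Writes must name a tenant; there is no shared pool to fall into.
  upsert(tenantId: string, chunk: StoredChunk): void {
    const bucket = this.namespaces.get(tenantId) ?? [];
    bucket.push(chunk);
    this.namespaces.set(tenantId, bucket);
  }

  // Reads only ever see the caller's namespace. An unknown or empty tenant id
  // yields no results instead of everyone's results.
  query(tenantId: string, vector: number[], topK: number): StoredChunk[] {
    const bucket = this.namespaces.get(tenantId) ?? [];
    const dot = (a: number[], b: number[]) =>
      a.reduce((sum, v, i) => sum + v * b[i], 0);
    return [...bucket]
      .sort((x, y) => dot(vector, y.vector) - dot(vector, x.vector))
      .slice(0, topK);
  }
}

const nsStore = new NamespacedStore();
nsStore.upsert("acme", { text: "Acme onboarding checklist", vector: [1, 0] });
nsStore.upsert("globex", { text: "Globex Q4 revenue targets", vector: [0, 1] });

// The query vector matches Globex's chunk best, but Acme can only reach Acme's namespace.
const acmeHits = nsStore.query("acme", [0, 1], 5);
// A caller with no valid tenant id gets nothing back.
const strangerHits = nsStore.query("", [0, 1], 5);
```

The design choice is that the tenant is a required argument of the lookup, not an optional filter on the results. Forgetting it is a type error or an empty result, never a cross-tenant hit.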

A real case

A shared RAG index leaked one tenant's docs to another

A multi-tenant assistant with a single vector namespace returned a competitor's board deck to a free-tier user who simply asked about revenue targets.

Find out if your RAG bot is leaking across tenants.

Flowpatrol seeds two workspaces and cross-tests every retrieval path. Five minutes. One URL.

The 'my RAG bot answers everyone with everyone's data' bug
Vector and Embedding Weaknesses
