Tags: security, rag, abac, enterprise-ai

Data Security in the Age of LLMs: Why Your RAG Needs a Real Security Model

InsightMesh Team

Retrieval-Augmented Generation (RAG) has become the default pattern for putting your own data behind a large language model. It works: instead of relying on what the model memorized during training, you retrieve relevant passages from your documents and let the model answer from them. Grounded, current, and far less prone to making things up.

But RAG quietly introduces a security problem that many teams discover only after they’ve shipped. When you build a retrieval layer over “all the company’s documents,” you have built a system whose entire job is to find and surface information across boundaries that used to keep that information apart. If the access model isn’t designed as carefully as the retrieval model, your helpful assistant becomes an efficient way to leak it.

The boundary you removed by accident

In most organizations, sensitive information stays separated mostly by friction. The HR folder is somewhere different from the engineering wiki. The board pack isn’t in the same drive as the sales collateral. People don’t see things they shouldn’t largely because finding them would be annoying.

RAG removes that friction on purpose. Ask a question, and the system cheerfully searches everything it was indexed over. Unless access control is enforced at retrieval time — before a passage is ever handed to the model — a well-phrased question can pull back content the asker was never meant to see. The OWASP community lists exactly this class of issue, including sensitive-information disclosure, in its Top 10 for LLM Applications.
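To make "enforced at retrieval time" concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the Passage type, the allowed_groups field stored next to each chunk, and the toy keyword scoring that stands in for a real vector search. The point is only the ordering: the access filter runs before ranking, so text the user may not read never becomes a candidate for the prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    doc_id: str
    text: str
    allowed_groups: frozenset[str]  # hypothetical ACL stored alongside each indexed chunk


def retrieve(query: str, index: list[Passage], user_groups: set[str], top_k: int = 3) -> list[Passage]:
    """Return up to top_k passages the user may read, ranked by a toy keyword score."""
    # Filter first: unauthorized passages never reach ranking or the prompt.
    visible = [p for p in index if p.allowed_groups & user_groups]
    terms = set(query.lower().split())
    scored = [(len(terms & set(p.text.lower().split())), p) for p in visible]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [p for _, p in ranked[:top_k]]


index = [
    Passage("hr-001", "salary bands for senior engineers", frozenset({"hr"})),
    Passage("eng-042", "onboarding guide for senior engineers", frozenset({"engineering", "hr"})),
]

# Same question, different askers, different candidate sets.
engineer_hits = retrieve("senior engineers salary", index, user_groups={"engineering"})
hr_hits = retrieve("senior engineers salary", index, user_groups={"hr"})
```

With this sample data, the engineer only ever gets eng-042 back, however the question is phrased, while the HR user gets both documents.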

Why “the model won’t tell them” is not a security model

A common first instinct is to handle this in the prompt: instruct the model not to reveal certain things. This is not security; it’s a suggestion. Prompt-level guardrails are probabilistic, bypassable, and invisible to your auditors. The right place to enforce access is at retrieval time, before any passage reaches the model, in a layer the model cannot talk its way around.

That means the question isn’t “what will the model say?” It’s “which documents was this specific user, in this context, ever allowed to retrieve in the first place?”

RBAC gets you part of the way. ABAC finishes the job.

Most systems start with Role-Based Access Control (RBAC): this user is an “admin,” that one is a “viewer.” RBAC is familiar and useful, but it’s coarse. It can say “this user is in Finance.” It struggles to say “this user may read this contract only if it belongs to their business unit, it’s marked active, and it isn’t still in draft.”

That finer-grained logic is Attribute-Based Access Control (ABAC), defined in NIST SP 800-162. ABAC evaluates a request against attributes of the user, the resource, the action, and the context (NIST’s terms are subject, object, operation, and environment conditions). That combination is exactly what a RAG system needs to make safe retrieval decisions on a per-query basis.
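The contract rule above can be written as a single attribute check. This is a minimal sketch, not a policy language and not any product's engine: the User, Contract, and Context types are hypothetical, and the finance-department and corporate-network conditions are assumptions added to show where user and context attributes would plug in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: str
    department: str
    business_unit: str


@dataclass(frozen=True)
class Contract:
    contract_id: str
    business_unit: str
    status: str  # hypothetical values: "active", "expired"
    is_draft: bool


@dataclass(frozen=True)
class Context:
    on_corporate_network: bool  # hypothetical context attribute


def can_access(user: User, contract: Contract, action: str, context: Context) -> bool:
    """Allow only when every attribute condition holds; anything else is denied."""
    return (
        action == "read"
        and user.department == "finance"                    # user attribute (assumed rule)
        and user.business_unit == contract.business_unit    # user vs. resource attribute
        and contract.status == "active"                     # resource attribute
        and not contract.is_draft                           # resource attribute
        and context.on_corporate_network                    # context attribute (assumed rule)
    )


analyst = User("u-17", department="finance", business_unit="emea")
office = Context(on_corporate_network=True)

signed = Contract("c-100", business_unit="emea", status="active", is_draft=False)
draft = Contract("c-101", business_unit="emea", status="active", is_draft=True)
other_unit = Contract("c-102", business_unit="apac", status="active", is_draft=False)

readable = [c.contract_id for c in (signed, draft, other_unit) if can_access(analyst, c, "read", office)]
```

A role check alone would stop at "is in Finance" and let all three contracts through. The attribute check leaves only c-100, and the same user off the corporate network gets nothing.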

InsightMesh is built around a dedicated ABAC policy engine that evaluates every request before it reaches any data — or any agent. Retrieval is filtered by policy, not by hope.

Don’t forget the agents

There’s a second boundary that classic RAG security ignores entirely: actions. Once your AI can do more than answer — run a query, extract data, generate a report — you have to govern not just what it can see but what it can do. An agent that can execute SQL is only as safe as the scope it’s confined to.

This is why we apply the same policy engine to the agent layer as to the data layer. An agent inherits the limits of the user it acts for, and can’t take an action its policy doesn’t allow.
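One way to picture "an agent inherits the limits of the user it acts for" is a wrapper that checks every tool call against that user's grants. The sketch below is illustrative only and does not describe InsightMesh's implementation: Grant, ScopedAgent, PolicyDenied, the action names, and the count_rows tool are all made up for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Grant:
    action: str    # hypothetical action name, e.g. "sql.select"
    resource: str  # hypothetical resource name, e.g. a table


class PolicyDenied(Exception):
    """Raised when the acting user holds no grant for the requested action."""


class ScopedAgent:
    def __init__(self, user_id: str, user_grants: frozenset[Grant]) -> None:
        self.user_id = user_id
        self._grants = user_grants  # the agent gets the user's grants and nothing more
        self.audit_log: list[tuple[str, str, str, str]] = []

    def run(self, action: str, resource: str, tool: Callable[[str], str]) -> str:
        # The check happens outside the model, before the tool is invoked.
        allowed = Grant(action, resource) in self._grants
        self.audit_log.append((self.user_id, action, resource, "allow" if allowed else "deny"))
        if not allowed:
            raise PolicyDenied(f"{self.user_id} may not {action} on {resource}")
        return tool(resource)


def count_rows(table: str) -> str:
    # Stand-in for a real query tool; it only builds the statement it would run.
    return f"SELECT COUNT(*) FROM {table}"


agent = ScopedAgent("u-17", frozenset({Grant("sql.select", "sales_orders")}))
statement = agent.run("sql.select", "sales_orders", count_rows)

try:
    agent.run("sql.select", "hr_salaries", count_rows)
    denied = False
except PolicyDenied:
    denied = True
```

The denied call never reaches the tool, and both attempts land in the audit log, so an auditor can see what the agent tried as well as what it did.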

A short checklist before you ship RAG

  • Enforce access at retrieval time, not in the prompt.
  • Filter by attributes, not just roles — user, resource, action, and context.
  • Isolate tenants and projects so one context can never retrieve another’s data.
  • Govern agent actions with the same model that governs reads.
  • Keep answers source-backed so every disclosure is traceable and auditable.
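The tenant-isolation and source-backed items on this checklist can be sketched together. This is a rough illustration under stated assumptions: TenantIndex, Chunk, and answer_with_sources are hypothetical names, and substring matching stands in for real retrieval. Each (tenant, project) pair gets its own partition, and whatever is retrieved carries the document ids it came from.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str


class TenantIndex:
    """Keeps a separate chunk list per (tenant, project) so lookups cannot cross partitions."""

    def __init__(self) -> None:
        self._partitions: dict[tuple[str, str], list[Chunk]] = defaultdict(list)

    def add(self, tenant: str, project: str, chunk: Chunk) -> None:
        self._partitions[(tenant, project)].append(chunk)

    def search(self, tenant: str, project: str, term: str) -> list[Chunk]:
        # Only the caller's own partition is scanned; unknown partitions yield nothing.
        candidates = self._partitions.get((tenant, project), [])
        return [c for c in candidates if term.lower() in c.text.lower()]


def answer_with_sources(chunks: list[Chunk]) -> dict[str, list[str]]:
    """Bundle retrieved text with the doc_ids it came from, so each disclosure is traceable."""
    return {"context": [c.text for c in chunks], "sources": [c.doc_id for c in chunks]}


idx = TenantIndex()
idx.add("acme", "pricing", Chunk("acme-7", "Renewal pricing for 2025"))
idx.add("globex", "pricing", Chunk("globex-3", "Renewal pricing exceptions"))

acme_answer = answer_with_sources(idx.search("acme", "pricing", "renewal"))
wrong_project = idx.search("acme", "legal", "renewal")
```

Both tenants hold a chunk that matches "renewal", but the acme query returns only acme-7, and the returned sources list gives an audit trail for what the model was shown.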

RAG made your data useful to AI. A real security model is what makes it safe to leave on. If you’re putting retrieval in front of sensitive documents, design the access layer with at least as much care as the retrieval layer — ideally more.

Want to see ABAC-governed retrieval in practice? Get in touch.