In 2026 everyone has shipped an AI agent. Far fewer have shipped one they could defend in an audit. Surveys keep finding the same gap: most security leaders are worried about AI-agent risk, and only a handful have actually put mature controls around it. Teams are deploying agents faster than they can govern them, and in healthcare, finance, or anywhere regulated, that's how a great demo becomes a reportable breach.
Here's the category error underneath it: a capable model is not a compliant system. You can point the best model in the world at protected health information (PHI) and still be wildly non-compliant. Compliance isn't a property of the model; it's a property of the architecture around it. (We've argued before that a good model doesn't make a tool HIPAA-compliant; with agents, the gap gets wider.)
A chatbot reads and replies. An agent does things: it calls tools, queries databases, writes records, sends messages, and remembers across turns. Every one of those is a new place where regulated data can leak or an action can go unlogged.
The model is maybe 10% of the risk. The other 90% is everything the agent is wired to touch.
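To make "an unlogged action" concrete, here is a minimal sketch of one way to close that gap: wrap every tool the agent can call so that each call leaves an audit record, whether it succeeds or fails. The names `audited`, `AUDIT_LOG`, and `lookup_patient` are hypothetical, and the in-memory list is only a stand-in for a real append-only audit store.

```python
import hashlib
import json
import time
from functools import wraps

AUDIT_LOG = []  # hypothetical stand-in for an append-only audit store


def audited(tool_name):
    """Wrap a tool so every call, successful or not, leaves an audit record."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(actor, **kwargs):
            # Log a digest of the arguments, not the arguments themselves,
            # so raw PHI is not copied into the audit log.
            digest = hashlib.sha256(
                json.dumps(kwargs, sort_keys=True, default=str).encode("utf-8")
            ).hexdigest()
            record = {
                "ts": time.time(),
                "actor": actor,
                "tool": tool_name,
                "args_sha256": digest,
                "outcome": "error",  # overwritten only if the call succeeds
            }
            try:
                result = fn(**kwargs)
                record["outcome"] = "ok"
                return result
            finally:
                # Runs on success and on exception, so no call goes unlogged.
                AUDIT_LOG.append(record)
        return wrapper
    return decorator


@audited("lookup_patient")
def lookup_patient(patient_id):
    # Hypothetical tool body; a real one would query a database.
    return {"patient_id": patient_id, "status": "found"}


lookup_patient("agent-7", patient_id="p-123")
```

The design choice worth noting is that the logging lives in the wrapper, not in the tool or the prompt, so the model cannot skip it. A plain hash of a low-entropy identifier is not de-identification, so treat the digest here as illustrative only.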
If an agent goes near PHI, governance controls aren't nice-to-haves; they're the difference between "audit-ready" and "liability".
It doesn't matter whether the agent is read-only or "just" does retrieval-augmented generation (RAG). Read-only still means PHI egresses to wherever you embed and store it. RAG still puts patient data in a vector store and a prompt. The questions an auditor asks (where did the data go, who could see it, what's logged, who signed a BAA) don't care whether your agent writes anything. They care where the data went.
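Those auditor questions can be turned into a simple inventory check. The sketch below is one hypothetical way to do it: list every destination that receives PHI and flag the ones that fail a question. The `DataFlow` fields, the `audit_gaps` helper, and the example inventory are all illustrative assumptions, not a compliance standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFlow:
    destination: str      # where PHI ends up
    baa_signed: bool      # is there a BAA with whoever operates it?
    access_logged: bool   # are reads and writes logged?
    readers: tuple        # roles that can see the data


def audit_gaps(flows):
    """Return a human-readable gap for each unanswered auditor question."""
    gaps = []
    for f in flows:
        if not f.baa_signed:
            gaps.append(f"{f.destination}: no BAA on file")
        if not f.access_logged:
            gaps.append(f"{f.destination}: access is not logged")
        if not f.readers:
            gaps.append(f"{f.destination}: no documented list of who can read it")
    return gaps


# Hypothetical inventory for a read-only RAG agent.
flows = [
    DataFlow("embedding API", baa_signed=False, access_logged=True,
             readers=("ml-platform",)),
    DataFlow("vector store", baa_signed=True, access_logged=False,
             readers=("ml-platform", "support")),
    DataFlow("LLM prompt", baa_signed=True, access_logged=True,
             readers=("ml-platform",)),
]

for gap in audit_gaps(flows):
    print(gap)
```

Note that nothing in this inventory asks whether the agent writes anything. A read-only agent still produces three data flows here, and two of them have gaps.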
The winners in regulated AI aren't the teams with the flashiest agent. They're the teams whose agent can pass the audit, because the governance was designed in, not bolted on after the demo got applause. If your agent touches PHI (or card data, under PCI), build the framework first and let the model be the easy part.
That's how we build production AI for regulated industries: compliance by design, with the agent governance auditors actually ask for. If that's the bar your product has to clear, here's how we work.
If you're running agents on regulated data, what's your hardest governance problem right now? Logging, BAAs, or keeping humans in the loop without killing the UX?