AI agents are moving from novelty into workflows that touch real systems. They do not just answer prompts. They connect to messaging apps, install skills, call tools, interact with files and live workspaces, and generate code that can change the environment around them. OpenClaw helped make that future visible by giving people a way to run, message and extend a persistent agent in the real world.

The problem is that OpenClaw also made the security gap visible. In early 2026, NVD published CVE-2026-25253 for an OpenClaw issue that could expose a token during an automatic WebSocket connection, and Oasis Security disclosed a separate chain that allowed a malicious website to take over a locally running OpenClaw agent until users updated to version 2026.2.25 or later. Those incidents did not just expose bugs in one product. They exposed the reality that an AI agent is a new kind of attack surface: software that can reason, reach and act.

That is why the next chapter of AI adoption will not be defined by model quality alone. It will be defined by whether enterprises can answer a much more practical question: what exactly is this agent allowed to touch, and how do we know it stayed within bounds?

OpenClaw showed why agents need governance

Traditional application security assumes relatively fixed interfaces, known users and bounded workflows. Agentic systems stretch all three. Once an agent can install helpers, call external tools, work with a live workspace and write code into that environment, the challenge is no longer just protecting the application. The challenge is governing a chain of decisions, permissions and side effects across the whole workflow. Cisco captures this well in its OpenClaw security framing: the key question is not just what the agent can do, but what happens if it trusts the wrong component.

OpenClaw's own security documentation makes the same point in a different way. The project says its guidance assumes a personal assistant trust model with one trusted operator boundary per gateway. It explicitly says OpenClaw is not a hostile multi-tenant boundary for mutually untrusted users sharing one gateway, and recommends splitting trust boundaries with separate gateways, OS users, or hosts when isolation is required. It also says there is no perfectly secure setup, and that operators should be deliberate about who can talk to the bot, where it is allowed to act, and what it can touch. That is not just product guidance. It is the blueprint for the category.

In other words, the conversation has already moved past whether agents are useful. They are. The real question is how to let them create value without giving them unmanaged access to systems, data and workflows that were never designed for autonomous software.

A sandbox is the starting point, not the destination

This is where NemoClaw matters. NVIDIA introduced NemoClaw™ in March 2026 as an open-source reference stack for securing OpenClaw deployments. With one command, NemoClaw installs OpenClaw and runs it inside the NVIDIA OpenShell runtime, providing an isolated sandbox, privacy routing and policy-based guardrails. From an enterprise perspective, that is important because it establishes the first layer any serious agent deployment needs: containment. It is the beginning of a perimeter.

OpenShell's security model is notable because it is explicit. NVIDIA documents deny-by-default controls across four layers: network, filesystem, process and inference. The platform describes those layers as the structure of the agent's isolated world and the gateway that mediates what is allowed out of it. It also highlights hardened blueprints with capability drops and least-privilege network rules. That is exactly the kind of baseline enterprises need at the start of the journey. At the same time, NVIDIA is transparent that OpenShell and NemoClaw are still alpha software and should not be used in production yet. That honesty is helpful because it reinforces the bigger truth: this market is early, and the control model is still forming.
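To make the deny-by-default idea concrete, here is a minimal sketch of a policy gate spanning the four documented layers. The class names, rule shapes and example actions are illustrative assumptions, not OpenShell's actual API; the point is only that anything not explicitly allowlisted is refused.

```python
# Hypothetical sketch of a deny-by-default policy gate across the four
# layers NVIDIA documents for OpenShell: network, filesystem, process,
# inference. All names and rule shapes here are illustrative assumptions.
from dataclasses import dataclass, field


@dataclass
class LayerPolicy:
    # Explicit allowlist; anything not listed is denied.
    allowed: set[str] = field(default_factory=set)

    def permits(self, action: str) -> bool:
        return action in self.allowed


@dataclass
class SandboxPolicy:
    network: LayerPolicy = field(default_factory=LayerPolicy)
    filesystem: LayerPolicy = field(default_factory=LayerPolicy)
    process: LayerPolicy = field(default_factory=LayerPolicy)
    inference: LayerPolicy = field(default_factory=LayerPolicy)

    def check(self, layer: str, action: str) -> bool:
        policy = getattr(self, layer, None)
        # Unknown layers are denied too: deny by default, everywhere.
        return policy.permits(action) if isinstance(policy, LayerPolicy) else False


# Example: one outbound host and one writable path are allowed; nothing else.
policy = SandboxPolicy(
    network=LayerPolicy(allowed={"api.example.com:443"}),
    filesystem=LayerPolicy(allowed={"/workspace"}),
)
assert policy.check("network", "api.example.com:443")
assert not policy.check("network", "attacker.example:80")
assert not policy.check("process", "spawn:/bin/sh")
```

The design choice that matters is the empty default: a layer with no rules permits nothing, so forgetting to configure a layer fails closed rather than open.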

Still, a sandbox alone is not enough. A sandbox can limit blast radius, but it cannot answer every governance question. It does not tell you whether a skill should have been trusted in the first place. It does not decide whether a prompt contains an injection attempt, whether a tool call is trying to exfiltrate data, or whether agent-written code introduces dangerous behavior into the environment. Enterprises need the layer above containment, not just the container around it.

From perimeter to defense in depth

The diagram shows how an AI agent requires a layered control plane

That is the promise behind DefenseClaw. Cisco positions it as the operational governance layer for OpenClaw, and the project repository describes a simple rule: components should be scanned before they run, and dangerous behavior should be blocked automatically. In practice, that means scanning skills, plugins and MCP servers before they are admitted; monitoring those directories continuously for changes; inspecting prompts, completions and tool invocations at runtime; and using CodeGuard to review agent-written code for patterns like embedded secrets, unsafe execution, path traversal and risky network calls. The project also stores audit data locally and supports forwarding events to SIEM tools such as Splunk. 
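The CodeGuard idea can be sketched very simply. The following is a deliberately naive pattern scanner in the spirit of what the DefenseClaw repository describes; real scanners use far richer analysis, and these regexes are illustrative assumptions, not CodeGuard's actual rules.

```python
import re

# Illustrative only: a naive scanner for the four risk patterns named in
# the text (embedded secrets, unsafe execution, path traversal, risky
# network calls). The regexes are assumptions, not CodeGuard's rules.
RISK_PATTERNS = {
    "embedded_secret": re.compile(
        r"(api[_-]?key|secret|token)\s*=\s*['\"][A-Za-z0-9_\-]{16,}['\"]", re.I
    ),
    "unsafe_execution": re.compile(r"\b(eval|exec|os\.system|subprocess\.Popen)\s*\("),
    "path_traversal": re.compile(r"\.\./"),
    "risky_network_call": re.compile(r"\b(urlopen|requests\.(get|post))\s*\("),
}


def scan_agent_code(source: str) -> list[str]:
    """Return the risk categories matched in a piece of agent-written code."""
    return [name for name, pattern in RISK_PATTERNS.items() if pattern.search(source)]


snippet = 'api_key = "sk_live_abcdef1234567890"\nos.system("cat ../" + path)'
findings = scan_agent_code(snippet)
assert "embedded_secret" in findings
assert "unsafe_execution" in findings
assert "path_traversal" in findings
```

Even this toy version illustrates the governance point: agent-written code can be checked automatically against a policy before it is allowed to run, and the findings can feed the same audit trail as everything else.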

This is the shift executives should focus on. The future of agent security is not one magic feature. It is a control stack. One layer constrains what the agent can reach. Another decides what the agent is allowed to install. Another inspects what the agent is trying to do in real time. Another captures evidence so security and compliance teams can prove what happened after the fact. Cisco summarized the division of labor clearly in its broader enterprise framing: OpenShell constrains what an agent can do at the infrastructure level, while Cisco AI Defense verifies that what it reaches was trustworthy and behaved as expected. That is what a control plane looks like in practice.

This matters because agents do not fail in only one way. They can be over-permissioned. They can trust the wrong extension. They can be manipulated by prompt injection. They can write dangerous code. They can leak secrets. They can drift from their original purpose over time as new tools and connections are added. A serious control plane treats all of those as connected governance problems, not isolated technical bugs.

Kubernetes as the execution layer for agent security

Kubernetes plays a critical role in turning agent security from theory into practice. It provides the operational foundation that allows enterprises to apply security controls consistently across many agents, environments and teams. What matters is not Kubernetes itself, but what it enables: repeatable enforcement of policy, isolation, segmentation and observability at scale.

The building blocks are already well understood. Pod Security Admission enforces workload boundaries. Network policies control how agents communicate. RBAC defines what agents and their components are allowed to access. Secrets management ensures sensitive data is protected and accessed with least privilege. Runtime controls such as seccomp, AppArmor or SELinux help constrain behavior at execution time. Resource limits reduce the risk of abuse or denial-of-service scenarios.
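As one concrete instance of those building blocks, here is a default-deny NetworkPolicy for an agent namespace, expressed as the Python dict you would serialize to YAML and apply with kubectl. The namespace name is an illustrative assumption; the manifest fields follow the standard Kubernetes NetworkPolicy schema.

```python
# Sketch: a default-deny NetworkPolicy for an agent namespace. With an
# empty podSelector and no ingress/egress rules listed, every pod in the
# namespace is selected and all traffic is denied until more specific
# policies allow it. The namespace name "agents" is an assumption.
def default_deny_policy(namespace: str) -> dict:
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # Empty selector = every pod in the namespace.
            "podSelector": {},
            # Declaring both types with no rules denies all traffic.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


policy = default_deny_policy("agents")
assert policy["spec"]["podSelector"] == {}
assert policy["spec"]["policyTypes"] == ["Ingress", "Egress"]
```

Starting from this baseline, each agent's allowed destinations become explicit additions rather than implicit defaults, which is exactly the repeatable enforcement the paragraph above describes.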

When combined, these capabilities form the operational backbone of an agent security control plane. They allow organizations to move beyond one-off hardening efforts and into a model where security is consistently enforced, observable and scalable across the enterprise.

The open source world is moving in the same direction

The most encouraging sign in this market is that the direction is broader than any single vendor. Kubescape 4.0 is a good example, highlighted by InfoQ and detailed in the CNCF's recent official release announcement. The release moved runtime threat detection to general availability, manages rules and bindings as Kubernetes CRDs, and supports alert export into existing operational stacks. More importantly for this discussion, it introduced AI-era capabilities aimed at both helping AI assistants understand Kubernetes security posture and securing AI agents themselves. On the agent side, Kubescape 4.0 added posture scanning for KAgent and introduced controls aimed at issues such as empty security contexts, missing NetworkPolicies and over-privileged namespace watching.

That is a meaningful signal. It says the industry is starting to converge on a new assumption: agents are not just another workload to deploy. They are systems that need policy, inspection, runtime visibility and dedicated posture management of their own. It also says the open-source ecosystem is beginning to supply the operational scaffolding enterprises will need if they want to scale beyond a few experiments and into governed production use.

Where this market is going

Over time, AI agent security will likely look less like point hardening and more like an enterprise control plane. The winning architecture will combine containment, least privilege, admission control, runtime inspection, secrets management and evidence collection into one operating model. In that model, the question is not whether an agent is safe in some absolute sense. The question is whether the organization can continuously define, enforce and verify the boundaries around that agent's behavior. That is the difference between experimentation and governance.

The immediate challenge is that the tooling is still early. OpenClaw's own documentation frames the system as an evolving experiment rather than a perfectly secure configuration. NVIDIA NemoClaw and OpenShell are alpha. DefenseClaw is newly open-sourced. Kubescape's AI-agent posture work is brand new. But that should not be read as a reason to wait. It should be read as a reason to build the right control assumptions now, before agent sprawl gets ahead of enterprise governance.

What executives should do now

First, define trust boundaries before defining features. Decide which agents are personal productivity tools, which are enterprise workflow agents, and which should never cross those lines. OpenClaw's guidance is a useful reminder here: when mixed trust or adversarial users are in play, split the boundaries with separate gateways, OS users or hosts. The core principle is simple. If an agent touches sensitive systems, it should run in a dedicated boundary with minimal permissions and a clearly scoped purpose.

Second, demand an admission model for tools, extensions and skills. In the agent world, a bad component can be as dangerous as a bad identity. That means nothing new should be trusted blindly. It should be scanned, approved, monitored for drift, and, when necessary, blocked automatically. This is where projects like DefenseClaw are pointing the market, and it is a requirement that will outlast any specific product name.
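The scan-approve-monitor loop can be reduced to a simple invariant: a component is fingerprinted when approved, and any later change to its bytes is treated as drift and denied. This sketch uses a content hash; the class and function names are illustrative assumptions, not any specific product's API.

```python
import hashlib


def fingerprint(component_bytes: bytes) -> str:
    """Content hash used as the identity of an approved component."""
    return hashlib.sha256(component_bytes).hexdigest()


class AdmissionRegistry:
    """Toy admission model: deny anything unapproved or changed since approval."""

    def __init__(self) -> None:
        self._approved: dict[str, str] = {}  # component name -> approved hash

    def approve(self, name: str, component_bytes: bytes) -> None:
        self._approved[name] = fingerprint(component_bytes)

    def admit(self, name: str, component_bytes: bytes) -> bool:
        return self._approved.get(name) == fingerprint(component_bytes)


registry = AdmissionRegistry()
registry.approve("weather-skill", b"original skill code")
assert registry.admit("weather-skill", b"original skill code")
assert not registry.admit("weather-skill", b"tampered skill code")  # drift
assert not registry.admit("unknown-skill", b"anything")             # never approved
```

The same check runs at install time and on every directory-watch event, so a skill that was safe when approved cannot silently become something else.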

Third, start with visibility, then move toward enforcement. Cisco's runtime model supports observe mode before action mode, which is a practical approach for organizations that want to understand real agent behavior before turning on blocking. That pattern makes sense more broadly. Enterprises should first learn where agents reach, what they install, what they write and how they behave in the wild. Then they should convert those observations into policy.
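The observe-then-enforce pattern is easiest to see as one rule set evaluated in two modes: observe mode records would-be violations without blocking, enforce mode blocks them. The rule shape below is an assumption for illustration, not Cisco's implementation.

```python
# Sketch of observe mode vs. enforce mode over the same rule set.
# The blocked-action strings are illustrative assumptions.
class RuntimePolicy:
    def __init__(self, blocked_actions: set[str], enforce: bool = False) -> None:
        self.blocked = blocked_actions
        self.enforce = enforce
        self.audit_log: list[str] = []

    def evaluate(self, action: str) -> bool:
        """Return True if the action may proceed."""
        if action in self.blocked:
            self.audit_log.append(f"violation: {action}")
            # Observe mode lets the action through but records it;
            # enforce mode blocks it.
            return not self.enforce
        return True


observer = RuntimePolicy({"read:/etc/shadow"}, enforce=False)
assert observer.evaluate("read:/etc/shadow") is True   # allowed, but logged
assert observer.audit_log == ["violation: read:/etc/shadow"]

enforcer = RuntimePolicy({"read:/etc/shadow"}, enforce=True)
assert enforcer.evaluate("read:/etc/shadow") is False  # now blocked
```

Because both modes share one rule set and one audit log, the violations observed in the first phase become the evidence used to decide which rules are safe to flip to enforcement.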

Finally, measure success in governance terms, not demo terms. The right questions are not just how impressive the agent looks or how many tasks it can complete. The right questions are whether you can prove what it accessed, whether you can restrict what it installs, whether you can stop unsafe behavior in real time, whether you can protect secrets and sensitive data, and whether your controls can scale across teams and environments without becoming manual overhead. Kubernetes and the emerging open-source ecosystem matter because they offer a path to operationalize exactly that.

AI agent adoption is moving fast. The organizations that benefit most will not be the ones that simply deploy more agents. They will be the ones that build the control plane around them. A sandbox is necessary. But for the enterprise, it is only the beginning.
