Partner POV | Cyber-ready AI: Why Enterprise AI Security Can't Stay in the Sandbox
In this article
- The Security Gap in AI Data Architectures
- The AI Data Governance Factor
- The Cloud: Where AI Problems Generally Start
- Is Zero Trust the Answer?
- Pure Storage Cyber-aware, Governance-first AI Infrastructure
- The NVIDIA AI Factory Security Foundation
- Security-first Storage for AI: The Real Silver Bullet
This article was written by Andy Stone, CTO-Americas at Pure Storage.
One of the most fascinating aspects of AI's journey from experimentation to production deployment has been the limitations it exposes when it makes its way out of the sandbox. Maybe it's a proof of concept spun up in the cloud that's successful until it hits a barrier to scale. Or the carefully mapped infrastructure budget that goes out the window three years before the next refresh. Or, maybe it's a sandbox environment with security approaches that work fine in the lab but break down when the AI is scaled across an enterprise data estate.
If that last one sounds alarming, it should.
In the controlled environment of a data science lab, success is relative, and security often is, too. Sandboxes bear little resemblance to the complex, distributed reality of enterprise AI, where these projects must ultimately deliver on their promise. Security measures that seem manageable with gigabytes of test data create vulnerabilities (and compliance and governance headaches) once terabytes of customer information and sensitive data fill the pipeline.
The Security Gap in AI Data Architectures
IBM's 2025 Cost of a Data Breach Report revealed that 13% of organizations experienced breaches of AI models or applications, with 97% of those compromised lacking proper AI access controls.
Bolting on multi-vendor solutions that treat storage as a "passive" component leaves organizations vulnerable: they miss crucial threat signals hidden in their data, and ransomware targeting AI data lakes and model repositories thrives on the fragmentation between storage platforms and cybersecurity tools.
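To make that concrete, here's a minimal Python sketch of the kind of threat signal a storage layer can surface: ransomware encryption produces near-random data, so a sustained jump in the Shannon entropy of incoming writes is a useful early warning. The threshold, run length, and write feed are illustrative assumptions, not any vendor's actual detection logic.

```python
import math
import os
from collections import Counter

def shannon_entropy(block: bytes) -> float:
    """Bits per byte: ~8.0 for encrypted/compressed data, lower for typical files."""
    if not block:
        return 0.0
    n = len(block)
    return -sum(c / n * math.log2(c / n) for c in Counter(block).values())

ENTROPY_THRESHOLD = 7.5  # illustrative: near-random writes are suspicious
SUSPECT_RUN = 100        # alert after this many consecutive high-entropy blocks

def monitor_writes(blocks):
    """Yield an alert once a sustained run of high-entropy writes is seen."""
    run = 0
    for i, block in enumerate(blocks):
        run = run + 1 if shannon_entropy(block) > ENTROPY_THRESHOLD else 0
        if run == SUSPECT_RUN:
            yield f"possible mass encryption starting near block {i - run + 1}"

# Random bytes mimic an encryptor overwriting files block by block.
print(list(monitor_writes(os.urandom(4096) for _ in range(150))))
```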
When attackers target your AI training data sets and model checkpoints—the intellectual property that represents months or years of investment—traditional backup strategies won't be enough.
Once organizations start depending on AI-driven decisioning, automation, or analytics, any outage or data corruption can be as damaging as a breach itself. Resilience becomes as critical as security. AI resilience means ensuring continuous availability and recoverability of models, data sets, and pipelines so that business operations aren't disrupted when (not if) something fails. True protection combines security, integrity, and rapid recovery—because an AI system that's "secure" but offline still costs the business dearly.
The AI Data Governance Factor
A recent study found that major AI governance frameworks (e.g., NIST, ALTAI, UK's toolkit) leave large portions of risk unaddressed, with compliance-security gaps as high as 80%.
Here's where AI adoption collides with a wave of new compliance requirements. From the EU's AI Act to rapidly tightening cross-border data regulations in the U.S. and Asia, enterprises face a landscape where governance is no longer optional. The blind spots are not just in how AI models behave but in how data flows across borders, how sensitive information gets logged, and how regulatory audits are satisfied. In financial services, for example, regulators require that every transaction be logged with enough context to reconstruct how the AI behaved and what went wrong should an issue arise.
Compliance obligations also go beyond "checkbox" security. Regulators increasingly demand hard evidence: that data is encrypted, that immutable logs exist, that recovery procedures are tested, and that governance extends across the entire AI pipeline, from training data sets to deployed inference workloads.
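To illustrate what "hard evidence" of log integrity can look like, here's a minimal Python sketch of a hash-chained audit log, where every entry commits to the one before it, so any after-the-fact edit breaks verification. True immutability still has to come from the storage layer (WORM media, retention locks); this only demonstrates the tamper-evidence idea.

```python
import hashlib
import json
import time

def append_entry(log: list, event: dict) -> dict:
    """Append an event whose hash commits to the previous entry."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    body = {"ts": time.time(), "event": event, "prev": prev_hash}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    entry = {**body, "hash": digest}
    log.append(entry)
    return entry

def verify(log: list) -> bool:
    """Recompute every hash; an edited or deleted entry breaks the chain."""
    prev_hash = "0" * 64
    for entry in log:
        body = {k: entry[k] for k in ("ts", "event", "prev")}
        recomputed = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if entry["prev"] != prev_hash or recomputed != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True

log = []
append_entry(log, {"actor": "training-job", "action": "dataset:read"})
append_entry(log, {"actor": "rag-agent", "action": "vector-store:read"})
print(verify(log))                            # True
log[0]["event"]["action"] = "dataset:delete"  # simulate tampering
print(verify(log))                            # False: tampering is detectable
```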
The Cloud: Where AI Problems Generally Start
Here's the heart of the AI problem: Most AI initiatives start in the cloud, where very few controls are put in place, almost by design, to accelerate model development.
That's fine for a while, but eventually companies want to integrate their private, confidential, and valuable data with these models via retrieval-augmented generation (RAG). That's where the problems arise. Most companies, especially large, regulated ones, won't trust that data to an untrusted model in a public space, which means bringing the models back on premises, to a trusted, controlled environment, to do the integration work.
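For readers newer to the pattern, RAG itself is simple: retrieve the most relevant private documents, then hand them to the model as context, all inside the trusted boundary. The Python sketch below uses toy bag-of-words vectors purely to show the flow; a production system would use a real embedding model and vector store, and the documents and model call here are hypothetical stand-ins.

```python
import math
from collections import Counter

DOCS = [
    "Q3 revenue grew 12 percent driven by subscription renewals",
    "customer PII must remain in the EU data center",
    "incident 4412 ransomware attempt blocked and remediated",
]

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; real systems use learned vector embeddings."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_prompt(question: str, k: int = 2) -> str:
    """Retrieve the top-k private documents and ground the model's answer in them."""
    q = embed(question)
    top = sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]
    context = "\n".join(top)
    return f"Answer using only this context:\n{context}\n\nQ: {question}"

# In production, the prompt goes to a model hosted inside the trusted boundary.
print(build_prompt("where must customer PII be stored"))
```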
This cloud-to-on-prem migration presents its own problems: the company now has to find a way to migrate both the model and the data, which can take a lot of time, tying up very expensive resources (i.e., data scientists) and dramatically slowing development of the service.
Is Zero Trust the Answer?
While zero trust architecture (ZTA) isn't a catchall, it's a vast improvement on traditional network security techniques. Its foundational principle is this: Assuming trust anywhere in a security model is a flawed approach. Implicit trust should never be granted to a user, device, or application based solely on its location on a secure network. Every user and system must authenticate and validate their identity before accessing network resources.
The principle of least privilege has always been a cornerstone of ZTA: Give users only the access they need, nothing more. But the rise of AI has changed things. And remember that zero trust isn't only about policies and guardrails; it rests on hygiene. Keeping IAM directories free of stale accounts, patching systems to eliminate known exploits, and conducting regular application security reviews are the day-to-day practices that make ZTA workable at scale. Without this hygiene, even the strongest guardrails can fail.
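Much of that hygiene is automatable. As a trivial illustration, assuming a hypothetical export of accounts and last-login timestamps from an IAM directory, a periodic job can flag stale identities for review:

```python
from datetime import datetime, timedelta

# Hypothetical export from an IAM directory: (account, last_login, is_service_account)
ACCOUNTS = [
    ("j.doe", datetime(2025, 9, 1), False),
    ("etl-pipeline", datetime(2024, 11, 20), True),
    ("old-contractor", datetime(2024, 2, 14), False),
]

STALE_AFTER = timedelta(days=90)

def stale_accounts(accounts, now=None):
    """Flag accounts, human or machine, with no login inside the window."""
    now = now or datetime.now()
    return [name for name, last_login, _ in accounts if now - last_login > STALE_AFTER]

print(stale_accounts(ACCOUNTS))  # anything past the 90-day window gets reviewed
```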
AI copilots and autonomous agents now interact with corporate systems, often with broad privileges. If overprovisioned, they can exfiltrate data or trigger changes at machine speed. These non-human identities (NHIs) include APIs, service accounts, microservices, IoT devices, and increasingly, AI agents. They outnumber human accounts by more than 80:1, and many use hard-coded credentials, shared keys, or static tokens—exactly the kinds of "back doors" ZTA was meant to eliminate.
To address this, companies must apply the same rigor to machines as they do to humans by using credential vaulting, frequent credential rotation, attribute-based access policies, and behavioral monitoring. The future is dynamic guardrails—AI-enforced least privilege. Instead of static rules, policies adapt continuously based on user, device, and contextual risk signals.
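Here's a rough Python sketch of what such a dynamic guardrail could look like: instead of a static allow-list, the decision function weighs entitlements together with live risk signals at request time. The identities, attributes, scores, and thresholds are illustrative assumptions, not a standard or a shipping product.

```python
from dataclasses import dataclass

@dataclass
class Request:
    identity: str         # human or non-human (agent, service account, API)
    resource: str
    device_trusted: bool
    anomaly_score: float  # 0.0 (normal behavior) to 1.0 (highly anomalous)

# Illustrative least-privilege entitlements: identity -> resources it may touch
ENTITLEMENTS = {
    "rag-agent": {"vector-store:read"},
    "training-job": {"dataset:read", "checkpoint:write"},
}

def decide(req: Request) -> str:
    """Entitlement is necessary but never sufficient; live risk signals decide."""
    if req.resource not in ENTITLEMENTS.get(req.identity, set()):
        return "deny"      # never provisioned for this resource
    if not req.device_trusted or req.anomaly_score > 0.8:
        return "deny"      # hard risk signals override the entitlement
    if req.anomaly_score > 0.4:
        return "step-up"   # force re-authentication or a short-lived credential
    return "allow"

print(decide(Request("rag-agent", "vector-store:read", True, 0.2)))   # allow
print(decide(Request("rag-agent", "checkpoint:write", True, 0.0)))    # deny
```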
Finally, we need a way to monitor the controls themselves, to ensure they're not only implemented but also effective and working.
Pure Storage Cyber-aware, Governance-first AI Infrastructure
Our approach makes the storage layer an active participant in AI data governance, compliance, threat detection, and response. Pure Storage directly addresses the aforementioned "most AI issues start in the cloud" scenario via solutions like Pure Storage Cloud Dedicated (formerly Cloud Block Store), which brings portability to data and eases the data gravity that standard cloud storage environments often impose. Pure Storage also provides features like snapshots for fast checkpointing while AI training is underway and SafeMode™ to protect data from potentially hostile action.
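As a sketch of how snapshot-backed checkpointing fits into a training loop, consider the following Python outline. The `take_snapshot` hook is a hypothetical stand-in for whatever platform API is in use, not Pure's actual interface, and the loop stands in for real training.

```python
import json
import time
from pathlib import Path

# In practice this would be a mount of a snapshot-capable volume.
CHECKPOINT_DIR = Path("checkpoints")

def take_snapshot(volume: str, label: str) -> None:
    """Hypothetical hook: call the storage platform's snapshot API here."""
    print(f"snapshot requested: volume={volume} label={label}")

def save_checkpoint(step: int, model_state: dict) -> None:
    CHECKPOINT_DIR.mkdir(parents=True, exist_ok=True)
    (CHECKPOINT_DIR / f"step-{step}.json").write_text(json.dumps(model_state))
    # An array-level snapshot is near-instant and immutable once taken,
    # so a corrupted or encrypted checkpoint can simply be rolled back.
    take_snapshot("ai-volume", f"ckpt-step-{step}-{int(time.time())}")

for step in range(0, 3000, 1000):  # stand-in for a real training loop
    save_checkpoint(step, {"step": step, "loss": 1.0 / (step + 1)})
```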
Pure Fusion™ serves as an intelligent control plane, automatically registering FlashArray™ systems in security workflows and enabling policy enforcement across all AI infrastructure. The result is consistent governance and oversight whether AI workloads run in public cloud, private cloud, or on-premises environments.
Pure Protect™ recovery zones automatically provision isolated recovery environments for non-disruptive testing and validation of AI applications and data sets, ensuring seamless continuity. This enables organizations to remediate and recover from malicious attacks without impacting production AI workloads, providing immediate restoration capabilities for mission-critical AI factories.
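One way to picture that validation step: capture a manifest of file hashes while data is known-good, then, inside the isolated recovery zone, refuse to promote any restore that doesn't match. A minimal Python sketch, with the paths and workflow assumed for illustration:

```python
import hashlib
from pathlib import Path

def manifest(root: Path) -> dict:
    """SHA-256 of every file under root, keyed by relative path."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*")) if p.is_file()
    }

def failed_files(known_good: dict, restored_root: Path) -> list:
    """Files that are missing from, or differ in, the restored copy."""
    restored = manifest(restored_root)
    return [path for path, digest in known_good.items()
            if restored.get(path) != digest]

# Workflow: build the manifest while data is known-good, then in the
# isolated recovery zone check the restore before promoting it:
#   if failed_files(saved_manifest, Path("/recovery/datasets")):
#       raise RuntimeError("restore failed validation; do not promote")
```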
With our extensive native threat detection and partnerships with CrowdStrike, Veeam, and Superna, we have an extended threat detection network that shares bi-directional threat signals across the entire AI data pipeline. We also built our own Threat Model Mentor GPT at Pure Storage, an AI-powered tool that automates threat modeling to help democratize cybersecurity expertise.
The NVIDIA AI Factory Security Foundation
As an NVIDIA-Certified Storage Partner with both Foundation- and Enterprise-level validation, we meet the highest standards of quality, efficiency, and reliability for AI factories. The platform features TPM and UEFI secure boot capabilities, enterprise-grade identity and access management, and bring-your-own-key encryption for multi-tenant environments. This architecture supports NVIDIA's Enterprise Reference Architectures with validated configurations from 4 to 1,024+ GPUs, and the unified software stack (combining NVIDIA Base Command Manager, NVIDIA AI Factories, Portworx® by Pure Storage, and NVIDIA Run:ai) creates policy-driven security orchestration across the entire AI pipeline.
Security-first Storage for AI: The Real Silver Bullet
As I stated earlier, AI has done it again: It's exposed an architectural limitation of traditional storage and demanded something better. Even organizations not investing in large-scale AI can recognize the signal: AI won't be the last big disruption, but they can weather it and be ready for whatever comes next.