This article was written by Jeremy Foster, Senior Vice President & General Manager at Cisco


At NVIDIA GTC in Washington, D.C., this week, Cisco shared how we're advancing Cisco Secure AI Factory with NVIDIA—the enterprise foundation for AI that runs securely, observably, and at scale. The momentum spans four pillars: security, observability, core AI infrastructure, and ecosystem partnerships.

We'll focus here on core AI infrastructure—the connective tissue that turns innovation into impact.

Networking: From fabric to Kubernetes, policy that travels with the workload

AI pipelines are expanding across data centers, clouds, and edge sites. As they scale, the network determines whether they feel fast and governable—or fragile.

Cisco Isovalent Enterprise Networking for Kubernetes is now validated for inference workloads on Cisco AI PODs, extending enterprise-grade policy and observability from the physical fabric into Kubernetes itself.

The result: a consistent operating model from wire to workload. The same segmentation and telemetry principles that secure the underlay now define how services communicate within clusters. Platform teams can maintain speed and governance without fragmenting their network stack.
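
To make this concrete, here is a minimal sketch of policy that travels with the workload, written against the standard Kubernetes NetworkPolicy API through the official Python client. The "inference" namespace and every label below are illustrative assumptions, not names from Cisco's validated designs; Isovalent's Cilium-based enforcement layers richer policy types on top of this baseline.

```python
# Minimal sketch: restrict ingress to inference pods so only pods labeled
# role=gateway can reach them. Namespace and labels are hypothetical.
# Cilium/Isovalent enforces standard NetworkPolicy and adds its own CRDs.
from kubernetes import client, config

config.load_kube_config()  # assumes a reachable cluster via ~/.kube/config
api = client.NetworkingV1Api()

policy = client.V1NetworkPolicy(
    metadata=client.V1ObjectMeta(name="allow-gateway-only"),
    spec=client.V1NetworkPolicySpec(
        pod_selector=client.V1LabelSelector(
            match_labels={"app": "inference-server"}  # hypothetical label
        ),
        policy_types=["Ingress"],
        ingress=[
            client.V1NetworkPolicyIngressRule(
                _from=[
                    client.V1NetworkPolicyPeer(
                        pod_selector=client.V1LabelSelector(
                            match_labels={"role": "gateway"}
                        )
                    )
                ]
            )
        ],
    ),
)

api.create_namespaced_network_policy(namespace="inference", body=policy)
```

Because the rule keys off pod labels rather than IP addresses, it follows the workload wherever the scheduler places it, which is what lets segmentation principles from the underlay carry into the cluster.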

Looking ahead, Cisco Nexus Hyperfabric for AI will deepen this convergence. Built to treat AI as an end-to-end workload, it will simplify how fabrics are designed, deployed, and expanded across training and inference environments. Intent-based blueprints will encode bandwidth and latency requirements common to distributed training and vector workloads, aligned with Cisco Validated Designs. Isovalent and Hyperfabric are shaping a unified path forward—policy, performance, and visibility aligned across every layer.
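
Hyperfabric's blueprint schema isn't published here, so purely as an illustration of what "encoding intent" means, a blueprint might capture requirements along these lines. Every field name and value below is a hypothetical sketch, not the product's format:

```python
# Illustrative only: a toy data model for the kind of intent a fabric
# blueprint could encode. Field names and values are hypothetical and
# do not reflect the Nexus Hyperfabric schema.
from dataclasses import dataclass

@dataclass
class FabricIntent:
    workload: str            # e.g. "distributed-training"
    min_bisection_gbps: int  # east-west bandwidth floor
    max_rtt_us: int          # latency ceiling between GPU nodes
    lossless: bool           # no-drop (RoCE-style) transport required

training_intent = FabricIntent(
    workload="distributed-training",
    min_bisection_gbps=800,
    max_rtt_us=10,
    lossless=True,
)
print(training_intent)
```

The value of an intent model is the division of labor: operators declare outcomes such as bandwidth floors and latency ceilings, and the controller derives the switch-by-switch configuration that satisfies them.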

Compute: A unified runway from pilot to production

Scaling AI shouldn't mean building separate systems for every stage of the journey. The latest compute platforms from Cisco provide a single foundation that grows from pilot to production.

Cisco UCS C880A M8, built on NVIDIA HGX B300 and Intel Xeon 6 processors with performance cores, enables large-scale training with high GPU density, predictable east-west performance, and enterprise-grade telemetry. It serves as a performance cornerstone of Cisco AI PODs, engineered for throughput and serviceability.

Complementing it, the Cisco UCS X-Series X580p node and X9516 X-Fabric technology make UCS X-Series a certified platform for the NVIDIA RTX PRO 6000 Blackwell Server Edition GPU, bringing high-bandwidth, future-ready connectivity inside the chassis. Together, these platforms create a unified compute roadmap: training, fine-tuning, and inference on one operational track.

Each server includes NVIDIA Spectrum-X SuperNICs to scale across an AI cluster, as well as NVIDIA BlueField-3 DPUs to accelerate GPU access to data. And together with NVIDIA AI Enterprise software, these Cisco UCS compute platforms can accelerate the development and deployment of production-grade, end-to-end generative AI pipelines.

What customers see in practice is a unified compute roadmap rather than a patchwork of silos. Training scale lands on UCS C880A M8; adjacent and downstream services expand across X-Series with the fabric headroom to handle shifting I/O and accelerator profiles. Because both ends of the spectrum live inside Cisco Validated Designs—and are automated and observed through Intersight—fleet operations stay consistent as estates grow. That consistency is the point: faster paths from pilot to production, fewer surprises during upgrades, and a platform that can absorb new workloads without rewriting the runbook.

Ecosystem: Choice without chaos

AI success depends on collaboration. Customers want the freedom to use familiar tools without inheriting operational sprawl. Under Cisco Secure AI Factory with NVIDIA, Cisco is expanding its ecosystem to deliver choice where it matters, consistency where it counts.

NVIDIA Run:ai introduces GPU orchestration built for the Kubernetes era. It transforms fragmented accelerator capacity into a governed, shareable utility—enforcing priorities, reclaiming idle resources, and integrating with Kubernetes namespaces for cost transparency. On Cisco AI PODs, it runs atop a substrate designed for predictable east-west performance with Nexus fabrics and lifecycle management through Intersight. The outcome: higher sustained utilization, shorter queue times, and fewer stranded GPU hours.
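
Run:ai's scheduler is proprietary, but the two behaviors named above, priority-aware quota enforcement and reclaiming idle capacity, can be sketched on toy data. Everything below is an illustrative model, not Run:ai code or its actual algorithm:

```python
# Toy illustration (not Run:ai's algorithm): teams have guaranteed GPU
# quotas, may borrow idle capacity, and lose borrowed GPUs when the
# owning team submits new work. All names and numbers are made up.
from dataclasses import dataclass

@dataclass
class Team:
    name: str
    quota: int          # guaranteed GPUs
    allocated: int = 0  # GPUs currently in use

def schedule(teams: list[Team], requester: Team, ask: int, total_gpus: int) -> int:
    free = total_gpus - sum(t.allocated for t in teams)
    granted = min(ask, free)
    # Reclaim borrowed GPUs from over-quota teams while the requester is
    # still below its guaranteed quota.
    deficit = min(ask - granted, requester.quota - requester.allocated - granted)
    for t in sorted(teams, key=lambda t: t.allocated - t.quota, reverse=True):
        while deficit > 0 and t.allocated > t.quota:
            t.allocated -= 1  # preempt one borrowed GPU
            granted += 1
            deficit -= 1
    requester.allocated += granted
    return granted

research = Team("research", quota=4)
prod = Team("prod", quota=4)
teams = [research, prod]

print(schedule(teams, research, ask=8, total_gpus=8))  # idle cluster: gets 8
print(schedule(teams, prod, ask=4, total_gpus=8))      # reclaims 4 from research
```

In the toy run, research borrows production's idle GPUs, and the scheduler claws them back the moment production submits work; that is the mechanism behind higher sustained utilization and fewer stranded GPU hours.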

Nutanix Kubernetes Platform (NKP) simplifies day-2 operations with predictable upgrades, drift control, and Git-based policy—keeping clusters current and compliant across environments, including air-gapped or regulated sites. Paired with Nutanix Unified Storage (NUS), which merges file and object access, teams can move data efficiently through the AI pipeline without duplicating data sets or losing provenance.
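
The Git-based policy model is easiest to see in miniature: the manifest committed to Git is the desired state, and anything the live cluster reports differently is drift to reconcile. Here is a minimal sketch with illustrative values, not NKP internals:

```python
# Minimal sketch of Git-based drift control (illustrative, not NKP's
# implementation). Git holds the desired state; the cluster reports the
# live state; the diff is drift for a reconciler to revert.
desired = {  # committed to Git
    "replicas": 3,
    "image": "registry.example.com/inference:1.4.2",
    "resources": {"limits": {"nvidia.com/gpu": "1"}},
}

live = {  # what the cluster currently runs
    "replicas": 5,  # someone scaled by hand
    "image": "registry.example.com/inference:1.4.2",
    "resources": {"limits": {"nvidia.com/gpu": "1"}},
}

def diff(desired: dict, live: dict, path: str = "") -> list[str]:
    drift = []
    for key in desired.keys() | live.keys():
        d, l = desired.get(key), live.get(key)
        where = f"{path}.{key}" if path else key
        if isinstance(d, dict) and isinstance(l, dict):
            drift += diff(d, l, where)
        elif d != l:
            drift.append(f"{where}: desired={d!r} live={l!r}")
    return drift

for finding in diff(desired, live):
    print("DRIFT", finding)  # a real reconciler reverts live to desired
```

A production reconciler reverts the live object rather than printing, but the loop is the same: read Git, read the cluster, converge. That loop is what keeps clusters current and compliant, including at air-gapped sites where the Git repository travels with the environment.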

Run:ai, NKP, and NUS bring operational clarity to complex AI systems. In a typical flow, data lands in NUS; clusters run on NKP; workloads are orchestrated by Run:ai; and performance is delivered by Cisco UCS and Nexus, with Intersight providing fleet-level visibility. The result: utilization trends up, complexity trends down, and every new workload builds on a stronger foundation than the last.

Momentum you can operationalize

Cisco and NVIDIA are building the infrastructure that turns AI from promise into production—securely, observably, and at scale.
