AI's potential to completely transform how organizations operate is exciting. Whether AI is used to enable new revenue streams, cultivate client loyalty through personalization, drive efficiencies through automation, or extract data-driven insights to guide decision-making, the possibilities are limited only by the imagination.

As data centers, workflows and applications evolve to support the technical demands of AI, navigating the complexities of this modernization work — including integrating AI solutions with legacy IT systems — can overwhelm even the most seasoned IT professionals. The explosion of generative AI (GenAI) has heightened the need for organizations to modernize their data centers and quickly embrace high-performance architecture (HPA). So, where do organizations go from here? 

Recommended reading: WWT Research – A Guide for CEOs to Accelerate AI Excitement and Adoption


Enter the AI Proving Ground. 

What is the AI Proving Ground?

The AI Proving Ground is a dynamic environment composed of industry-leading software, hardware and component solutions that can be integrated quickly. Combined with the knowledge and experience of our AI and infrastructure experts, and supported by our longstanding manufacturer partnerships, the AI Proving Ground allows organizations to experience the art of the possible for themselves while accelerating their time to market.

Developed within our Advanced Technology Center (ATC), this one-of-a-kind lab environment empowers IT teams to evaluate and test AI infrastructure, software and solutions for efficacy, scalability and flexibility — all under one roof. The AI Proving Ground provides visibility into data flows across the entire development pipeline, enabling more informed decision-making while safeguarding production environments. 

By addressing common hurdles to AI success — including hardware availability, high costs, skills gaps, power and cooling concerns, connectivity challenges, environment management, and complex architecture designs — the AI Proving Ground enables organizations to quickly, confidently and safely develop transformational AI solutions that deliver real business results in a fraction of the time and expense it would take to achieve on their own.

How clients are using the AI Proving Ground

Over the last several months of working with clients in the AI Proving Ground, we've witnessed growing demand for WWT's GPU-as-a-service offering. This on-demand service gives clients access to a collection of powerful GPU resources critical for powering both pre-built and customizable AI applications.

Below are a few of the other ways clients use the environment. 

Risk-free learning 

Fear of disrupting production environments often hinders experimentation. The AI Proving Ground allays that fear by providing a safe and secure sandbox for data scientists, data center engineers and software developers to learn, test, iterate and innovate. It's a playground for bold ideas, unencumbered by the inherent constraints and risks of live systems.

The AI Proving Ground supports IT professionals with hands-on access so they can evaluate AI hardware, software and reference architectures before deploying these solutions in their production environments:

  • Data scientists can use the AI Proving Ground to evaluate a range of AI models, including large language models (LLMs) for GenAI and application development; natural language processing (NLP) models for smart assistants, language translation and digital phone call response; and computer vision models for image classification, object recognition or object tracking (a minimal sketch follows this list).
  • Facilities engineers can use the AI Proving Ground to investigate the impact that prospective modernization efforts and AI integrations will have on existing data centers, and they can test drive the latest innovations in cooling technologies.
  • IT infrastructure engineers can use the AI Proving Ground to validate hardware and software integrations, assess the performance-per-watt of AI workflows, and ensure the overall supportability of a desired AI solution. They can also test their ability to provision, reclaim and understand chargeback components for each business unit.
  • Security engineers can use the AI Proving Ground to compare and contrast the ability of new AI solutions to protect the organization against attacks, breaches, and the exposure or loss of sensitive data.
  • Software engineers can use the AI Proving Ground to design and deploy AI solutions in a hybrid environment that features easy access to cloud, on-premises, edge and cloud-adjacent components.
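
To make the data science use case concrete, below is a minimal sketch of the kind of evaluation run a data scientist might start with in the lab: loading a small open LLM and timing a single inference pass. The Hugging Face transformers library and the gpt2 model are illustrative assumptions, not lab defaults.

```python
# Minimal sketch: evaluating a small language model for inference.
# Assumes the Hugging Face "transformers" and "torch" packages are
# installed; the model choice (gpt2) is illustrative, not a lab default.
import time

import torch
from transformers import pipeline

device = 0 if torch.cuda.is_available() else -1  # use a GPU if one is present
generator = pipeline("text-generation", model="gpt2", device=device)

prompt = "The AI Proving Ground lets engineers"
start = time.perf_counter()
output = generator(prompt, max_new_tokens=50, do_sample=False)
elapsed = time.perf_counter() - start

print(output[0]["generated_text"])
print(f"Inference latency: {elapsed:.2f}s")
```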

What makes up the AI Proving Ground?

From a technical lens, the AI Proving Ground is a heterogeneous "lab of labs" ecosystem that currently houses 13 different AI environments. You can find details about these environments below in "Environment details: 13 AI labs and growing."

Our AI Proving Ground labs range in focus from full reference architectures to automated component orchestration. By validating the performance of the latest AI hardware and software integrations here first, technologists can quickly and confidently pursue the AI-powered solutions that deliver the most business value.

The AI Proving Ground is a lab of labs

Environment details: 13 AI labs and growing

This section details each of the 13 lab environments currently operating in the AI Proving Ground. We plan to expand the number of dedicated AI labs available to our clients and partners in the coming months.


AI Lab 1: NVIDIA DGX H100 BasePOD lab

The DGX H100 BasePOD is a prescriptive AI reference architecture from NVIDIA designed for enterprise AI workflows. The DGX H100 BasePOD environment inside the AI Proving Ground features four NVIDIA DGX H100 appliances, two different 400GbE Ethernet fabrics (Cisco and Arista), a 400Gb NVIDIA Mellanox InfiniBand fabric, and seven storage solutions to choose from (Dell, NetApp, Pure Storage, VAST Data, IBM, DataDirect Networks, HPE GreenLake/VAST Data). This lab also includes the NVIDIA AI Enterprise (NVAIE) software platform and Run:ai's optimization and orchestration solution.

Use cases: Clients can use this lab to understand the level of effort needed to build and support their own AI environments and validate the integration of different enterprise networking and storage solutions. Clients can also leverage the lab's synthetic load generation solutions to record power and performance metrics or even build their own use cases to better understand the performance of different workloads within an integrated solution of their choice.
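
As a concrete illustration of recording power and performance metrics under synthetic load, the sketch below times a sustained matrix-multiplication workload with PyTorch and samples board power via nvidia-smi. The workload and sampling method are assumptions for illustration; they are not the lab's actual load-generation tooling.

```python
# Minimal sketch of a synthetic GPU load test with power sampling.
# Assumes PyTorch with CUDA and the nvidia-smi CLI are available.
import subprocess

import torch

def gpu_power_watts() -> float:
    """Sample the current board power draw from nvidia-smi."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=power.draw", "--format=csv,noheader,nounits"]
    )
    return float(out.decode().splitlines()[0])  # first GPU only

size = 8192
a = torch.randn(size, size, device="cuda", dtype=torch.float16)
b = torch.randn(size, size, device="cuda", dtype=torch.float16)

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
start.record()
for _ in range(100):  # sustained load so the power sample is meaningful
    c = a @ b
end.record()
torch.cuda.synchronize()

ms = start.elapsed_time(end)
tflops = 100 * 2 * size**3 / (ms / 1000) / 1e12  # ~2*n^3 FLOPs per matmul
print(f"~{tflops:.1f} TFLOP/s at ~{gpu_power_watts():.0f} W board power")
```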

AI Lab 2: NVIDIA GH200 Grace Hopper Superchip lab

The NVIDIA GH200 Grace Hopper Superchip is a breakthrough processor designed for giant-scale AI and high-performance computing (HPC) applications. The GH200 lab environment inside the AI Proving Ground supports a single GH200 appliance with both 400Gb InfiniBand and 400GbE Ethernet connections. According to NVIDIA, the superchip can deliver up to 10 times the performance for applications running terabytes of data.

Use cases: Clients can test HPC workloads on the GH200, recording performance and power metrics for each test. 

AI Lab 3: Composable XPU-as-a-Service lab

The AI Proving Ground features a dedicated, composable XPU-as-a-Service environment. This fully automated solution enables our engineers to build physical server environments with different server, CPU, GPU and operating system options. Thanks to our Liqid Composable Disaggregated Infrastructure solution and the RackN Digital Rebar platform, dedicated specialized builds can happen within minutes without the need to physically touch the servers.

The following options are available for specialized, dedicated servers:

  • Server partners: Dell and HPE
  • CPU partners: Intel and AMD
  • GPU partners:
    • NVIDIA: A100, A30, L40
    • Intel: Flex 140, Flex 170, Max 1100
    • AMD: MI210
  • Operating systems: RHEL 8, RHEL 9 and Ubuntu 22.04

Use cases: In this lab, clients can stand up servers in different configurations to evaluate AI models for training or inference, without having to build and integrate different accelerators by hand.
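
To illustrate what composability looks like from a user's perspective, the sketch below submits a hypothetical build request over REST, assembling the server, CPU, GPU and OS options listed above. The endpoint URL and payload schema are invented for illustration only; they are not the actual Liqid or RackN Digital Rebar APIs.

```python
# Illustrative only: a hypothetical composition request showing the kind
# of parameters a composable-infrastructure workflow assembles. The URL,
# endpoint and payload schema are invented; they are NOT the real Liqid
# or RackN APIs.
import requests

build_request = {
    "server": "Dell",            # Dell or HPE
    "cpu": "Intel",              # Intel or AMD
    "gpus": [{"model": "NVIDIA A100", "count": 2}],
    "os": "Ubuntu 22.04",        # RHEL 8, RHEL 9 or Ubuntu 22.04
}

resp = requests.post(
    "https://provisioning.example.internal/api/compose",  # hypothetical endpoint
    json=build_request,
    timeout=30,
)
resp.raise_for_status()
print("Build accepted:", resp.json())
```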

AI Lab 4: Intel Gaudi performance cluster lab

Intel's Gaudi AI accelerator cluster supports a single Gaudi-1 appliance (with eight first-generation deep learning processors) and a single Gaudi-2 appliance (with eight second-generation deep learning processors). Each HPC appliance can leverage either local NVMe storage or high-speed storage systems via a dedicated 100GbE network fabric.

Use cases: Clients can leverage this lab to validate different deep learning training or inferencing solutions while recording both performance and power metrics during testing.

AI Lab 5: NVIDIA Omniverse digital twin lab (with Dell)

Our digital twin environment features dedicated Dell PowerEdge 16G server nodes in support of NVIDIA's Omniverse developer platform, accelerated by NVIDIA L40 and A40 GPUs. The lab also supports Omniverse's database and collaboration engine (the Enterprise Nucleus Server) in a dedicated environment, allowing developers to build highly scalable, high-performance solutions.

Use cases: Clients can use this lab to evaluate and build digital twin solutions specific to their needs within an NVIDIA Omniverse framework.

AI Lab 6: Dell reference architecture lab (with NVIDIA)

This Dell reference architecture environment is a full-stack solution. Hardware components include dedicated Dell PowerSwitches for high-speed networking, Dell PowerEdge accelerator-optimized compute nodes (XE9680 and R760xa servers), and Dell PowerScale storage (the F600 array). The lab also features multiple MLOps and Kubernetes platform solutions, and it's enabled with NVIDIA H100 and L40 GPUs. Clients can apply the NVIDIA NVAIE framework to the environment or select a preferred MLOps and Kubernetes solution.

Use cases: This lab allows clients to evaluate full-stack solutions from both management and performance validation standpoints, including power consumption and performance metrics.

AI Lab 7: Data scientist development for GenAI lab (with Dell and Red Hat)

Our data scientist development cluster is a dedicated HPC environment that enables our data scientists to develop and train the LLMs used in GenAI solutions. This integrated solution includes dedicated high-speed networking, Dell PowerEdge 15G servers and NVIDIA A100 GPUs. Additionally, the environment is configured with an OpenShift container platform and a dedicated MLOps solution that gives each data scientist a dedicated workspace within the cluster, so they can execute their work without interfering with others using the cluster concurrently. The team can check in approved models for quick access.

Use cases: In this lab, data scientists can demonstrate different LLM solutions and training techniques.
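
One generic way to achieve the kind of workspace isolation described above on a Kubernetes-based platform such as OpenShift is a per-user namespace with a GPU resource quota. The sketch below uses the Kubernetes Python client to show that pattern; the namespace name and quota values are hypothetical, and this is not WWT's actual MLOps configuration.

```python
# Sketch of per-user workspace isolation on a Kubernetes-based platform.
# The namespace name and GPU quota are hypothetical examples; this is
# not WWT's actual MLOps configuration.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside the cluster
core = client.CoreV1Api()

team_ns = "ds-alice"  # hypothetical per-scientist namespace

# Create an isolated namespace for one data scientist.
core.create_namespace(
    client.V1Namespace(metadata=client.V1ObjectMeta(name=team_ns))
)

# Cap how many GPUs the workspace can request, so one user's jobs
# cannot starve everyone else sharing the A100 pool.
core.create_namespaced_resource_quota(
    namespace=team_ns,
    body=client.V1ResourceQuota(
        metadata=client.V1ObjectMeta(name="gpu-quota"),
        spec=client.V1ResourceQuotaSpec(hard={"requests.nvidia.com/gpu": "2"}),
    ),
)
```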

AI Lab 8: Data scientist capabilities lab

The data scientist capabilities cluster is a dedicated high-performance computing environment that enables our data scientists to evaluate and demo LLM solutions and training techniques for our clients and partners. The cluster is built on NVIDIA A100 GPUs and configured with an OpenShift Container Platform and a dedicated MLOps solution that provides data scientists with a dedicated space to execute their work without interference from other concurrent efforts within the cluster.

Use cases: Data scientists can leverage this lab environment to demo and validate different LLM solutions and training techniques.

AI Lab 9: AI security and application services lab (with Dell and Liqid)

This AI lab environment contains Dell PowerEdge 15G servers and a Liqid CDI fabric that lets additional GPUs be dynamically swapped in as needed.

Use cases: The lab features a dynamic, dedicated cluster that enables our security and application services teams to evaluate and showcase independent software vendor (ISV) solutions that leverage accelerators for faster, more accurate outcomes. This AI lab environment gives organizations the ability to leverage multiple types of GPUs from WWT's roster of manufacturer partners.

AI Lab 10: Intel AI Reference Kit lab (with Dell and Red Hat)

The Intel AI Reference Kit cluster consists of a five-node Dell PowerEdge 16G server environment with 5th Gen Intel Xeon Scalable processors. The nodes are connected via 100GbE Ethernet and are managed by the Red Hat OpenShift AI platform. Users can leverage one of the lab's five prebuilt AI solutions, or they can request that we build a special instance from one of Intel's other 29 solutions.

Use cases: Clients can experience demonstrations of one of the five prebuilt solutions on demand or reserve a cluster for their specific use case validation.

AI Lab 11: Liqid CDI POC lab (with Dell, Liqid and more)

The Liqid proof of concept (POC) cluster features composable disaggregated infrastructure solutions from Liqid along with Dell PowerEdge 15G servers. The dedicated environment currently supports two Dell PowerEdge Intel-based servers, two Dell PowerEdge AMD-based servers, and an 8-slot Liqid chassis that can be populated with Intel, AMD or NVIDIA GPUs along with Liqid NVMe IO Accelerator storage cards.

Use cases: Clients can validate the performance of in-server GPUs versus Liqid-attached GPUs for VDI, inference or training workflows.

AI Lab 12: HPE reference architecture for GenAI lab (with Aruba)

Our HPE reference architecture lab environment inside the AI Proving Ground is a full-stack solution. Hardware components include dedicated Aruba high-speed networking, HPE ProLiant and Cray accelerator-optimized compute nodes (Cray XD-670 and ProLiant DL-385 servers), and a dedicated HPE GreenLake/VAST Data array. The environment includes an MLOps platform from Determined AI as well as data fabric from Pachyderm, and it is enabled with NVIDIA H100 and L40 GPUs.

Use cases: Clients can use this lab to evaluate full-stack solutions from both management and performance validation standpoints, including power consumption and performance metrics.

AI Lab 13: NetApp AIPod lab (with NVIDIA)

NetApp's ONTAP AI Base Pod (AIPod) is a dedicated NVIDIA NeMo RAG (retrieval-augmented generation) demo environment. The lab includes dedicated high-speed networking, an NVIDIA DGX H100 appliance and a NetApp AFF A800 array. The environment will showcase NetApp's BlueXP portfolio and highlight NetApp's ability to provide industry-leading data mobility and multitenancy for AI workloads across all ATC-connected public cloud providers. This instance will be the first of the AI Proving Ground's hybrid cloud solutions.

Use cases: The environment will leverage the NVIDIA NeMo framework to quickly deploy different RAG frameworks, including NeMo Retriever, with NetApp storage endpoints (StorageGRID, ONTAP, FSxN, ANF and GCNV) that can be configured for vertical-specific use cases with the appropriate data ingestion.
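
For readers new to the pattern, the sketch below shows RAG in miniature: embed a small set of documents, retrieve the one closest to the query, and prepend it to the prompt sent to the generator. It uses a toy bag-of-words embedding so it runs anywhere; a NeMo Retriever deployment would instead use learned embeddings and a vector store backed by the NetApp endpoints above.

```python
# Toy illustration of the retrieval-augmented generation (RAG) pattern.
# Uses a bag-of-words embedding so the example is self-contained; a real
# deployment would use learned embeddings and a vector database.
import math
from collections import Counter

documents = [
    "The AIPod lab pairs an NVIDIA DGX H100 with a NetApp array.",
    "RAG grounds a language model's answers in retrieved documents.",
    "The ATC connects lab environments to public cloud providers.",
]

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str) -> str:
    """Return the document most similar to the query."""
    q = embed(query)
    return max(documents, key=lambda d: cosine(q, embed(d)))

query = "What does RAG do?"
context = retrieve(query)
prompt = f"Answer using this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # this prompt would then be sent to the LLM
```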


Conclusion

As AI and data solutions continue to evolve across manufacturers and industries, so too will the AI Proving Ground. WWT is dedicated to enhancing and scaling the capabilities of the AI Proving Ground, in close collaboration with our partners, to deliver cutting-edge AI-powered solutions and high-performance architectures that generate real business value. 

AI is revolutionizing the way we do business. Together, we can drive innovation to make a new world happen.

Follow WWT's artificial intelligence and data page to stay up to date.