Intelligent Resource Optimizer
Solution overview
In this lab we will explore the Intelligent Resource Optimizer and the full stack it runs on, starting with the underlying HPE environment, then moving up the NVIDIA stack, and finally arriving at the full stack software solution. We will walk through each layer of the project to show the benefits that every component brings to power the complete solution.
Intelligent Resource Optimizer
The Intelligent Resource Optimizer leverages AI-powered analysis to help IT operations teams quickly identify which computing infrastructure in their environment is over- or under-utilized. By analyzing historical point-in-time utilization statistics for CPU, disk, and RAM, it identifies machines that should be either sized up or scaled down based on current patterns. It then goes a step further and provides a predictive analysis, identifying not only which systems currently have misallocated resources but also those likely to be in that position in the near future. This enables teams to proactively right-size their compute resources, leading to better cost management where resources are over-allocated and better application performance where they are under-allocated.
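The core classification described above can be sketched in a few lines. This is a minimal illustration, not the product's actual logic: the threshold values and function names are assumptions chosen for the example.

```python
from statistics import mean

# Illustrative thresholds; a real deployment would tune these per workload.
UNDER_UTILIZED = 20.0   # average % below this -> candidate to size down
OVER_UTILIZED = 80.0    # average % above this -> candidate to size up

def classify(samples: list[float]) -> str:
    """Classify a machine from historical point-in-time utilization samples (%)."""
    avg = mean(samples)
    if avg < UNDER_UTILIZED:
        return "size down"
    if avg > OVER_UTILIZED:
        return "size up"
    return "right-sized"

# Example: point-in-time CPU utilization samples for two VMs
print(classify([5.2, 8.1, 3.9, 12.0, 6.5]))   # -> size down
print(classify([91.0, 88.5, 95.2, 97.0]))     # -> size up
```

The predictive layer mentioned above would extend this by fitting a trend to the samples rather than averaging them, flagging machines whose projected utilization crosses a threshold.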
HPE Private Cloud AI Platform
We start with an overview of the HPE Private Cloud AI Platform and the benefits it provides as a secure, well-known, and fully modularized platform. The key components of this platform that we will guide you through are the foundation that many AI-powered systems gloss over: how is the data secured, ingested, and then leveraged to bring an "AI solution" into production? The HPE Private Cloud AI platform helps address the questions that any IT organization must answer:
- Where is our data stored?
- How does this tie into our existing data stores?
- How can I maintain the integrity of the data?
HPE Private Cloud AI helps answer these questions by providing a robust, all-in-one solution for storage, orchestration, and compute in one integrated system. The Intelligent Resource Optimizer leverages this system by integrating with the built-in HPE GreenLake storage, the Ezmeral orchestration layer, and NVIDIA L40S GPUs.
NVIDIA Stack
The HPE Private Cloud AI system comes in several different configurations to fit a wide variety of use cases. For this use case, we chose the smaller configuration powered by two L40S GPUs. The L40S has 48 GB of memory per GPU, which means model sizes are limited by the maximum memory available per GPU; in this case, the largest LLM that can be used with 48 GB of memory is roughly a 32B-parameter model. With that constraint in mind, we focused on selecting the right NVIDIA inference NIM for the job, supported by an additional ML model to round out the system. NVIDIA NIM, part of NVIDIA AI Enterprise, is a set of easy-to-use microservices designed for secure, reliable deployment of high-performance AI model inferencing across clouds, data centers, and workstations. This solution relies on:
- OpenAI's GPT OSS 20B NIM
- Cisco-time-series-model-1.0
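The memory constraint above can be checked with a common rule of thumb: model memory is roughly parameter count times bytes per parameter, plus runtime overhead for the KV cache and activations. The overhead factor below is an illustrative assumption, not a measured value.

```python
def model_memory_gb(params_billion: float, bytes_per_param: float,
                    overhead: float = 1.2) -> float:
    """Rough serving-memory estimate: params * precision * runtime overhead.

    The 1.2 overhead factor (KV cache, activations, buffers) is illustrative.
    """
    return params_billion * bytes_per_param * overhead

# A 32B-parameter model quantized to 8-bit (1 byte/param):
print(round(model_memory_gb(32, 1), 1))   # 38.4 -> fits within a 48 GB L40S
# The same model at FP16 (2 bytes/param):
print(round(model_memory_gb(32, 2), 1))   # 76.8 -> exceeds a single 48 GB GPU
```

This back-of-the-envelope math is why a ~32B-parameter model is a practical ceiling for a single 48 GB GPU, and it depends heavily on quantization and context length.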
To leverage these NVIDIA NIMs, we have chosen the NVIDIA NeMo Agent Toolkit (NAT) as the foundational orchestration framework of the Intelligent Resource Optimizer. NeMo Agent Toolkit is an open-source AI framework for building, profiling, and optimizing agents and tools from any framework, enabling unified, cross-framework integration across connected agent systems. NAT has allowed us to quickly develop the AI pipeline that ingests large amounts of Service Management data and makes targeted recommendations from it.
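A deployed NIM exposes an OpenAI-compatible HTTP API, so orchestration frameworks can talk to it with standard chat-completion requests. The sketch below builds such a request; the host, port, and model identifier are assumptions for illustration, not this solution's actual configuration.

```python
import json

# Illustrative endpoint for a locally deployed NIM (host/port are assumptions).
NIM_URL = "http://localhost:8000/v1/chat/completions"

def build_request(question: str) -> dict:
    """Build an OpenAI-style chat payload for an inference NIM.

    The model identifier below is a placeholder for this sketch.
    """
    return {
        "model": "openai/gpt-oss-20b",
        "messages": [{"role": "user", "content": question}],
        "temperature": 0.2,
    }

payload = build_request("Which VMs look over-provisioned this week?")
print(json.dumps(payload, indent=2))
# In a live environment: requests.post(NIM_URL, json=payload).json()
```

Because the interface is OpenAI-compatible, the same payload works whether the model is served by a NIM on-premises or by a hosted endpoint, which is what lets NAT orchestrate these models without framework-specific glue code.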
Full Stack AI Solution
The last layer of this lab focuses on the full stack solution that powers the Intelligent Resource Optimizer. Built with a mixture of Python and JavaScript, the Intelligent Resource Optimizer serves as an example of modern full stack architecture built for deployment at scale. At WWT, we understand the need to build solutions that fit seamlessly into our clients' organizational landscape. Some portions of an AI application will always be new, but the other technical decisions don't have to be. Using modern software development practices in tandem with new technologies allows WWT to quickly build solutions that are ready for the enterprise.