Solution overview

In this lab we will explore the Incident Knowledge Assistant and the full stack it is optimized for, starting with the underlying HPE environment, moving up through the NVIDIA stack, and finishing with the full-stack software solution. We will walk through each layer of the project to show the benefits that every component brings to the overall solution.

Incident Knowledge Assistant

The Incident Knowledge Assistant leverages AI-powered analysis to help IT operations teams respond to and resolve trouble tickets quickly. By parsing knowledge base articles, previous incidents, known ongoing problems, and recent changes, it identifies similar resolved incidents, relevant work instructions, ongoing environmental issues that could be related, and recent work that may have caused the problem. This enables teams to triage issues quickly and respond faster to the individual experiencing the problem.
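To make the "identifies similar resolved incidents" step concrete, here is a minimal, self-contained sketch. Production systems use embedding vectors (see the NVIDIA stack below); this illustration substitutes simple keyword overlap (Jaccard similarity) so it runs without any model, and all incident IDs and texts are made up for the example.

```python
# Illustrative sketch of the "find similar resolved incidents" step.
# Real deployments compare embedding vectors; keyword (Jaccard) overlap
# stands in here so the example stays self-contained.

def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two keyword sets, from 0.0 to 1.0."""
    return len(a & b) / len(a | b) if a | b else 0.0

def rank_similar(ticket: str, resolved: dict[str, str], top_k: int = 2) -> list[str]:
    """Rank resolved incidents by keyword overlap with the new ticket."""
    ticket_words = set(ticket.lower().split())
    scored = [(jaccard(ticket_words, set(text.lower().split())), incident_id)
              for incident_id, text in resolved.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [incident_id for score, incident_id in scored[:top_k] if score > 0]

# Hypothetical resolved-incident corpus for illustration only.
resolved = {
    "INC-101": "vpn tunnel down after firewall change",
    "INC-102": "email delivery delayed by spam filter",
    "INC-103": "vpn client fails after certificate renewal",
}
print(rank_similar("vpn down after change", resolved))  # → ['INC-101', 'INC-103']
```

The same ranking shape applies when the score comes from an embedding model instead of keyword overlap; only the similarity function changes.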

HPE Private Cloud AI Platform

We start with an overview of the HPE Private Cloud AI platform and the benefits it provides as a secure, well-known, and fully modular platform. The key components we will guide you through form the foundation that many AI-powered systems gloss over: how is the data secured, ingested, and then leveraged to bring an "AI solution" into production? The HPE Private Cloud AI platform helps answer the questions that any IT organization has to address:

  • Where is our data stored?
  • How does this tie into our existing data stores?
  • How can I maintain the integrity of the data?

HPE Private Cloud AI helps answer these questions by providing a robust, all-in-one solution for storage, orchestration, and compute in one integrated system. The Incident Knowledge Assistant leverages this system by integrating with the built-in HPE GreenLake storage, the Ezmeral orchestration layer, and NVIDIA L40S GPUs.

NVIDIA Stack

The HPE Private Cloud AI system comes in several configurations to fit a wide variety of use cases. For this use case, we chose the smaller configuration powered by two L40S GPUs. Each L40S has 48 GB of memory, which caps model size at what fits in a single GPU's memory; in practice, the largest LLM that fits in 48 GB is roughly a 32B-parameter model. With that constraint in mind, we focused on picking the right NVIDIA inference NIM for this job, with a supporting cast of other NIMs to round out the system. NVIDIA NIM, part of NVIDIA AI Enterprise, is a set of easy-to-use microservices designed for secure, reliable deployment of high-performance AI model inferencing across clouds, data centers, and workstations. This solution relies on:

  • OpenAI's GPT OSS 20B NIM (generation)
  • NV-EmbedQA E5-Large NIM (embedding)
  • Llama-3.2-nv-rerankqa-1b-v2 NIM (reranking)
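NIM microservices expose an OpenAI-compatible HTTP API, so a chat request to the generation NIM is a plain JSON POST to a `/v1/chat/completions` endpoint. The sketch below builds such a request body; the in-cluster URL and the exact model identifier are illustrative assumptions, not values taken from this lab's deployment.

```python
# Hedged sketch: building an OpenAI-compatible chat request for a NIM.
# NIM_URL and the model id are assumed placeholders for illustration.
import json

NIM_URL = "http://nim-llm:8000/v1/chat/completions"  # assumed in-cluster address

def build_chat_request(ticket_summary: str, context: str) -> dict:
    """Build the JSON body for an OpenAI-compatible /v1/chat/completions call."""
    return {
        "model": "openai/gpt-oss-20b",  # assumed model id for the GPT OSS 20B NIM
        "messages": [
            {"role": "system",
             "content": "You triage IT incidents using the provided context."},
            {"role": "user",
             "content": f"Ticket: {ticket_summary}\n\nContext:\n{context}"},
        ],
        "temperature": 0.2,  # low temperature for consistent triage answers
    }

body = build_chat_request("VPN down after firewall change",
                          "INC-101: resolved by rolling back a firewall rule")
print(json.dumps(body, indent=2))
# In production this body is POSTed to NIM_URL with an Authorization header.
```

Because the interface is OpenAI-compatible, any OpenAI-style client library can target a NIM by pointing its base URL at the microservice.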

To leverage these NVIDIA NIMs, we have chosen the NVIDIA NeMo Agent Toolkit (NAT) as the foundational orchestration framework of the Incident Knowledge Assistant. NeMo Agent Toolkit is an open-source AI framework for building, profiling, and optimizing agents and tools from any framework, enabling unified, cross-framework integration across connected agent systems. NAT has allowed us to quickly develop the AI pipeline that ingests large amounts of Service Management data and makes targeted recommendations from that data.
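The pipeline NAT orchestrates follows a common retrieve-rerank-generate shape that maps onto the three NIMs above. The following is a framework-agnostic sketch of that flow, not NAT's actual API: the function names and toy stand-in stages are illustrative assumptions, and in production each callable would wrap a call to the corresponding NIM.

```python
# Framework-agnostic sketch of a retrieve -> rerank -> generate pipeline,
# mirroring the embedding, reranking, and generation NIMs. Names here are
# illustrative; this is not the NeMo Agent Toolkit API.
from typing import Callable

def triage_pipeline(ticket: str,
                    retrieve: Callable[[str], list[str]],
                    rerank: Callable[[str, list[str]], list[str]],
                    generate: Callable[[str, list[str]], str]) -> str:
    """Chain the three stages; each callable wraps one NIM in production."""
    candidates = retrieve(ticket)         # embedding NIM: broad recall
    ordered = rerank(ticket, candidates)  # reranker NIM: precision ordering
    return generate(ticket, ordered[:3])  # LLM NIM: targeted recommendation

# Toy stand-ins so the sketch runs end to end without any model.
docs = ["KB-7: reset vpn tunnel", "KB-2: mail queue flush", "INC-101: vpn fix"]
retrieve = lambda q: [d for d in docs if any(w in d for w in q.split())]
rerank = lambda q, cands: sorted(cands, key=lambda d: -sum(w in d for w in q.split()))
generate = lambda q, ctx: f"Suggested next steps based on: {'; '.join(ctx)}"

print(triage_pipeline("vpn outage", retrieve, rerank, generate))
```

Keeping each stage behind a plain callable is what lets an orchestration framework swap models, or entire NIMs, without rewriting the pipeline.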

Full Stack AI Solution

The last layer of this lab will focus on the full stack solution that powers the Incident Knowledge Assistant. Built using a mixture of Python and JavaScript, the Incident Knowledge Assistant serves as an example of modern full stack architecture that is built for deployment at scale. At WWT, we understand the need to build solutions that seamlessly fit into our clients' organizational landscape. Some portions of an AI application will always be new, but the other technical decisions don't have to be. Using modern software development practices in tandem with new technologies allows WWT to quickly build solutions that are ready for the enterprise.

Lab diagram


Technologies

What's next?

Learn more and stay up-to-date with the industry and the new technology we have at WWT.