Solution overview

In this lab we will explore the Incident Knowledge Assistant and the full stack it is optimized for, starting with the underlying HPE environment, moving up through the NVIDIA stack, and finishing with the full-stack software solution. We will walk through each layer of the project to show the benefits that every component brings to the overall solution.

Incident Knowledge Assistant

The Incident Knowledge Assistant leverages AI-powered analysis to help IT operations teams respond to and resolve trouble tickets quickly. By parsing knowledge base articles, previous incidents, known ongoing problems, and recent changes, it identifies similar resolved incidents, relevant work instructions, ongoing environmental issues that could be related, and recent work that may have caused the problem. This enables teams to triage issues quickly and respond faster to the individual experiencing the problem.
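To make the "identifies similar resolved incidents" step concrete, here is a minimal, self-contained sketch. Production systems use embedding vectors (see the NVIDIA stack below); this illustration substitutes simple keyword overlap (Jaccard similarity) so it runs without any model, and all incident IDs and texts are made up for the example.

```python
# Illustrative sketch of the "find similar resolved incidents" step.
# Real deployments compare embedding vectors; keyword (Jaccard) overlap
# stands in here so the example stays self-contained.

def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two keyword sets, from 0.0 to 1.0."""
    return len(a & b) / len(a | b) if a | b else 0.0

def rank_similar(ticket: str, resolved: dict[str, str], top_k: int = 2) -> list[str]:
    """Rank resolved incidents by keyword overlap with the new ticket."""
    ticket_words = set(ticket.lower().split())
    scored = [(jaccard(ticket_words, set(text.lower().split())), incident_id)
              for incident_id, text in resolved.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [incident_id for score, incident_id in scored[:top_k] if score > 0]

# Hypothetical resolved-incident corpus for illustration only.
resolved = {
    "INC-101": "vpn tunnel down after firewall change",
    "INC-102": "email delivery delayed by spam filter",
    "INC-103": "vpn client fails after certificate renewal",
}
print(rank_similar("vpn down after change", resolved))  # → ['INC-101', 'INC-103']
```

The same ranking shape applies when the score comes from an embedding model instead of keyword overlap; only the similarity function changes.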

HPE Private Cloud AI Platform

We start with an overview of the HPE Private Cloud AI platform and the benefits it provides as a secure, well-known, and fully modular platform. The key components we will guide you through form the foundation that many AI-powered systems gloss over: how is the data secured, ingested, and then leveraged to bring an "AI solution" into production? The HPE Private Cloud AI platform helps answer the questions that any IT organization has to address:

  • Where is our data stored?
  • How does this tie into our existing data stores?
  • How can I maintain the integrity of the data?

HPE Private Cloud AI helps answer these questions by providing a robust, all-in-one solution for storage, orchestration, and compute in one integrated system. The Incident Knowledge Assistant leverages this system by integrating with the built-in HPE GreenLake storage, the Ezmeral orchestration layer, and NVIDIA L40S GPUs.

NVIDIA Stack

The HPE Private Cloud AI system comes in several configurations to fit a wide variety of use cases. For this use case, we chose the smaller configuration powered by two L40S GPUs. Each L40S has 48 GB of memory, which caps model size at what fits in a single GPU's memory; in practice, the largest LLM that fits in 48 GB is roughly a 32B-parameter model. With that constraint in mind, we focused on picking the right NVIDIA inference NIM for this job, with a supporting cast of other NIMs to round out the system. NVIDIA NIM, part of NVIDIA AI Enterprise, is a set of easy-to-use microservices designed for secure, reliable deployment of high-performance AI model inferencing across clouds, data centers, and workstations. This solution relies on:

  • OpenAI's GPT OSS 20B NIM (generation)
  • NV-EmbedQA E5-Large NIM (embedding)
  • Llama-3.2-nv-rerankqa-1b-v2 NIM (reranking)
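NIM microservices expose an OpenAI-compatible HTTP API, so a chat request to the generation NIM is a plain JSON POST to a `/v1/chat/completions` endpoint. The sketch below builds such a request body; the in-cluster URL and the exact model identifier are illustrative assumptions, not values taken from this lab's deployment.

```python
# Hedged sketch: building an OpenAI-compatible chat request for a NIM.
# NIM_URL and the model id are assumed placeholders for illustration.
import json

NIM_URL = "http://nim-llm:8000/v1/chat/completions"  # assumed in-cluster address

def build_chat_request(ticket_summary: str, context: str) -> dict:
    """Build the JSON body for an OpenAI-compatible /v1/chat/completions call."""
    return {
        "model": "openai/gpt-oss-20b",  # assumed model id for the GPT OSS 20B NIM
        "messages": [
            {"role": "system",
             "content": "You triage IT incidents using the provided context."},
            {"role": "user",
             "content": f"Ticket: {ticket_summary}\n\nContext:\n{context}"},
        ],
        "temperature": 0.2,  # low temperature for consistent triage answers
    }

body = build_chat_request("VPN down after firewall change",
                          "INC-101: resolved by rolling back a firewall rule")
print(json.dumps(body, indent=2))
# In production this body is POSTed to NIM_URL with an Authorization header.
```

Because the interface is OpenAI-compatible, any OpenAI-style client library can target a NIM by pointing its base URL at the microservice.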

To leverage these NVIDIA NIMs, we have chosen the NVIDIA NeMo Agent Toolkit (NAT) as the foundational orchestration framework of the Incident Knowledge Assistant. NeMo Agent Toolkit is an open-source AI framework for building, profiling, and optimizing agents and tools from any framework, enabling unified, cross-framework integration across connected agent systems. NAT has allowed us to quickly develop the AI pipeline that ingests large amounts of Service Management data and makes targeted recommendations from that data.
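The pipeline NAT orchestrates follows a common retrieve-rerank-generate shape that maps onto the three NIMs above. The following is a framework-agnostic sketch of that flow, not NAT's actual API: the function names and toy stand-in stages are illustrative assumptions, and in production each callable would wrap a call to the corresponding NIM.

```python
# Framework-agnostic sketch of a retrieve -> rerank -> generate pipeline,
# mirroring the embedding, reranking, and generation NIMs. Names here are
# illustrative; this is not the NeMo Agent Toolkit API.
from typing import Callable

def triage_pipeline(ticket: str,
                    retrieve: Callable[[str], list[str]],
                    rerank: Callable[[str, list[str]], list[str]],
                    generate: Callable[[str, list[str]], str]) -> str:
    """Chain the three stages; each callable wraps one NIM in production."""
    candidates = retrieve(ticket)         # embedding NIM: broad recall
    ordered = rerank(ticket, candidates)  # reranker NIM: precision ordering
    return generate(ticket, ordered[:3])  # LLM NIM: targeted recommendation

# Toy stand-ins so the sketch runs end to end without any model.
docs = ["KB-7: reset vpn tunnel", "KB-2: mail queue flush", "INC-101: vpn fix"]
retrieve = lambda q: [d for d in docs if any(w in d for w in q.split())]
rerank = lambda q, cands: sorted(cands, key=lambda d: -sum(w in d for w in q.split()))
generate = lambda q, ctx: f"Suggested next steps based on: {'; '.join(ctx)}"

print(triage_pipeline("vpn outage", retrieve, rerank, generate))
```

Keeping each stage behind a plain callable is what lets an orchestration framework swap models, or entire NIMs, without rewriting the pipeline.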

Full Stack AI Solution

The last layer of this lab will focus on the full stack solution that powers the Incident Knowledge Assistant. Built using a mixture of Python and JavaScript, the Incident Knowledge Assistant serves as an example of modern full stack architecture that is built for deployment at scale. At WWT, we understand the need to build solutions that seamlessly fit into our clients' organizational landscape. Some portions of an AI application will always be new, but the other technical decisions don't have to be. Using modern software development practices in tandem with new technologies allows WWT to quickly build solutions that are ready for the enterprise.

Lab diagram


Technologies

What's next?

Learn more and stay up-to-date with the industry and the new technology we have at WWT.