Building the Future of Enterprise AI: IBM watsonx.ai on Red Hat OpenShift with Intel Gaudi 3
A modern AI platform for the enterprise
At the heart of this solution is a layered, cloud-native architecture that aligns rapid AI innovation with the operational rigor required in enterprise environments. Rather than treating AI as a standalone tool, the platform integrates compute, orchestration, MLOps, and foundation model services into a unified ecosystem that supports the full AI lifecycle from experimentation to large-scale production.
The foundation of the platform is built on Intel Gaudi 3-based systems, delivering the high-performance acceleration required for training and serving modern AI and foundation models. These systems provide the throughput, memory bandwidth, and scalability needed to support demanding workloads while maintaining cost-efficient operations.
IBM watsonx.ai: Enterprise-Grade AI Studio
IBM watsonx.ai provides the tools to build, fine-tune, govern, and deploy foundation-model-based AI solutions. It enables organizations to move beyond isolated experiments and create AI applications that are trusted, explainable, and integrated into real business workflows.
With watsonx.ai, teams can:
- Develop and orchestrate generative AI solutions
- Leverage foundation models for reasoning and content generation
- Manage prompts, pipelines and model lifecycles
- Apply governance, monitoring and risk controls
Red Hat OpenShift: The Hybrid Cloud Foundation
Red Hat OpenShift delivers the enterprise Kubernetes platform required to run AI workloads consistently across on-premises, public cloud, and edge environments. It provides:
- Secure, containerized infrastructure
- Built-in scalability and resilience
- DevSecOps and CI/CD integration
- Hybrid and multi-cloud portability
This ensures AI workloads are not locked into a single environment and can evolve alongside business needs.
Red Hat OpenShift AI: End-to-End MLOps
OpenShift AI extends OpenShift with a comprehensive data science and MLOps layer, supporting:
- Data preparation and notebooks
- Distributed training and pipelines
- Model serving and monitoring
- Lifecycle automation and governance
It underpins watsonx.ai workloads, enabling teams to manage the full AI lifecycle from development through deployment and continuous optimization.
Intel® Gaudi® 3: High-Performance AI Acceleration
Intel Gaudi 3 accelerators are purpose-built for large-scale AI training and inference. They deliver:
- High throughput for foundation models
- Strong performance-per-dollar economics
- High-bandwidth memory and networking
- An open, developer-friendly software ecosystem
Running this stack on Gaudi 3 systems allows enterprises to scale AI workloads efficiently while maintaining flexibility and control.
Supermicro AI Training Super Server
The platform runs on the Supermicro SuperServer SYS-822GA-NGR3, an 8U rackmount AI training system designed for large-scale machine learning, deep learning, large language model (LLM), and HPC workloads.
Key features
- High-Density Accelerators: Supports up to 8 Intel Gaudi® 3 AI accelerators (OAM form factor) for massive parallel computing and model training.
- Dual CPU Support: Dual Intel® Xeon® 6900 series processors with P-cores via LGA-7529 sockets, up to 128 cores/256 threads per CPU.
- Memory Capacity: 24 DIMM slots supporting up to 6 TB of DDR5 ECC memory (RDIMM/LRDIMM at 6400 MT/s or MRDIMM at 8800 MT/s).
- Networking: 6 × OSFP 800 GbE ports onboard — ideal for high-bandwidth interconnects in AI clusters.
- Storage: 8 hot-swap 2.5″ NVMe Gen5 bays at the front, plus 2 M.2 PCIe 5.0 x2 NVMe slots for boot or cache.
- PCIe Expansion: 2 × PCIe 5.0 x16 FHFL slots, 2 × PCIe 5.0 x8 FHFL slots, and 1 × PCIe 5.0 x4 AIOM (OCP 3.0) slot.
Power, Cooling & Chassis
- Redundant Power Supplies: 8 × 3000 W Titanium Level (96% efficiency) units for stable, high-power delivery.
- Chassis: 8U rackmount with heavy-duty cooling fans optimized for dense GPU/accelerator loads.
- Management & Security: SuperCloud Composer, Supermicro Server Manager (SSM), hardware TPM 2.0, secure boot, firmware signing, and remote management options.
Why this stack matters
Performance with Choice
Intel Gaudi 3 delivers enterprise-class acceleration without locking organizations into proprietary ecosystems.
Open by Design
Red Hat OpenShift ensures portability, interoperability, and long-term flexibility across environments.
Trusted Enterprise AI
IBM watsonx.ai brings governance, transparency, and integration: key requirements for regulated and large-scale enterprises.
Operational Excellence
OpenShift AI provides the tooling to manage AI as a living system, not a one-time project.
Intel Gaudi 3: Accelerating RAG & Visual Summarization Workflows
Intel Gaudi 3 is Intel's third-generation AI accelerator, designed for high-performance generative AI and large-scale RAG pipelines, including multimodal tasks such as video summarization and vision-augmented retrieval. It builds on Gaudi 2's architecture with increased compute, memory bandwidth, and efficient scalability for LLM and multimodal AI workloads.
While Intel's primary focus has been text-centric RAG, video and visual summarization emerge naturally as a next step.
Integrating Visual Summarization into RAG
Modern video summarization techniques use vision-language models (VLMs) to process video frames, audio transcripts, and visual embeddings, producing concise, narrative summaries of long videos. This approach extracts key events and semantic context, dramatically simplifying browsing and search.
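As an illustrative sketch (not the product's actual pipeline), the first step of such a workflow, selecting keyframes worth summarizing, can be reduced to comparing embeddings of consecutive frames and keeping a frame whenever it diverges enough from the last kept keyframe. The vectors below are toy stand-ins for real VLM embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def select_keyframes(frame_embeddings, threshold=0.9):
    """Keep a frame when its similarity to the last kept keyframe
    drops below `threshold` (i.e., the scene has changed enough)."""
    if not frame_embeddings:
        return []
    keyframes = [0]  # always keep the first frame
    for i in range(1, len(frame_embeddings)):
        if cosine(frame_embeddings[i], frame_embeddings[keyframes[-1]]) < threshold:
            keyframes.append(i)
    return keyframes

# Toy embeddings: frames 0-1 are near-identical, frame 2 is a scene change.
frames = [[1.0, 0.0], [0.99, 0.05], [0.1, 1.0]]
print(select_keyframes(frames))  # -> [0, 2]
```

In production, the embeddings would come from a VLM running on the accelerators, but the selection logic stays this simple.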
Visual Summarization
Combining vision models with RAG enables "visual RAG summarization" workflows where:
- Frames or segments are embedded into vector stores,
- A retriever surfaces relevant segments based on text or visual queries,
- LLMs articulate summaries, answers, or highlights in natural language.
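The three steps above can be sketched as a minimal in-memory pipeline. This is a hedged illustration, not the WWT/Intel implementation: the vector store is a plain list, the embeddings are toy vectors, and `summarize` stands in for a real LLM call:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class SegmentStore:
    """Step 1: in-memory stand-in for a vector database of video segments."""
    def __init__(self):
        self.segments = []  # (embedding, caption) pairs

    def add(self, embedding, caption):
        self.segments.append((embedding, caption))

    def retrieve(self, query_embedding, k=2):
        """Step 2: return the k captions most similar to the query."""
        ranked = sorted(self.segments,
                        key=lambda s: cosine(s[0], query_embedding),
                        reverse=True)
        return [caption for _, caption in ranked[:k]]

def summarize(captions):
    """Step 3: stand-in for an LLM that turns retrieved context into prose."""
    return "Summary of retrieved segments: " + "; ".join(captions)

store = SegmentStore()
store.add([0.9, 0.1], "forklift enters loading dock")
store.add([0.1, 0.9], "customer at checkout")
store.add([0.8, 0.2], "pallet unloaded at dock")

# A query embedding close to the "loading dock" segments.
print(summarize(store.retrieve([1.0, 0.0], k=2)))
```

Swapping the toy pieces for a real vector database, VLM embeddings, and an LLM endpoint yields the workflow described above.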
This methodology is increasingly highlighted in industry demonstrations (e.g., retail surveillance summaries and interactive Q&A interfaces over video).
WWT and Intel created a Visual RAG architecture using LangChain-based orchestration and a Llama large language model. Within this architecture, Visual Question Answering (VQA) is the task of answering open-ended questions about an image: the input to models supporting this task is typically a combination of an image and a question, and the output is an answer expressed in natural language.
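To make the VQA contract concrete, a minimal interface sketch is shown below. `DummyVQAModel` is a hypothetical stand-in for a real vision-language model (in the architecture above, that role is played by a Llama-based model behind LangChain orchestration); only the request/response shape is the point:

```python
from dataclasses import dataclass

@dataclass
class VQARequest:
    image: bytes   # encoded frame or image
    question: str  # open-ended natural-language question

@dataclass
class VQAResponse:
    answer: str    # natural-language answer

class DummyVQAModel:
    """Hypothetical stand-in: answers from a lookup table.
    A production model runs image + question through a multimodal LLM."""
    def __init__(self, canned):
        self.canned = canned

    def __call__(self, request: VQARequest) -> VQAResponse:
        return VQAResponse(answer=self.canned.get(request.question, "I don't know."))

model = DummyVQAModel({"How many forklifts are visible?": "Two."})
resp = model(VQARequest(image=b"<jpeg bytes>",
                        question="How many forklifts are visible?"))
print(resp.answer)  # -> Two.
```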
Visual RAG Architecture
Video Summarization Architecture
Why Intel Gaudi 3 for Video RAG?
Compared to traditional GPU clusters:
- Optimized specifically for generative AI workloads
- High memory capacity for multimodal models
- Ethernet-based scale-out
- Competitive cost-per-token economics
- Open software ecosystem
For enterprises running OpenShift, Gaudi integrates cleanly into Kubernetes-native workflows without proprietary lock-in.
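As a sketch of that integration: with the Intel Gaudi (Habana) device plugin installed on the cluster, a pod requests accelerators through the extended resource name `habana.ai/gaudi`, just like any other Kubernetes resource. Names and the image below are illustrative placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gaudi-training-job        # illustrative name
spec:
  containers:
  - name: trainer
    image: <your-gaudi-enabled-training-image>   # placeholder
    resources:
      limits:
        habana.ai/gaudi: 8        # request all 8 accelerators on the node
```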
Video and Visual Summarization represent the next frontier of enterprise AI, converting unstructured video into searchable, explainable intelligence.
By combining:
- Intel Gaudi 3 accelerators
- IBM watsonx.ai platform
- Red Hat OpenShift container platform
- OpenShift AI model lifecycle management
- Vector databases for retrieval
Enterprises can deploy an open, scalable, secure, and high-performance video intelligence platform capable of handling petabyte-scale archives.
This architecture turns video from passive storage into an active decision-support system.
Intel Gaudi 3 is purpose-built to power production-scale generative AI across industry verticals, including:
- Consumer Goods and Retail
- Healthcare and Medicine
- Manufacturing
- Media and Entertainment
- Financial Services
The path forward
As AI becomes embedded in the core of enterprise operations, success will depend less on isolated models and more on robust platforms: platforms that can scale, adapt, and govern AI responsibly.
By deploying IBM watsonx.ai with Red Hat OpenShift and OpenShift AI on Intel Gaudi 3 systems, organizations establish a future-proof foundation for enterprise AI, one that turns innovation into impact.
The future of AI is not just about intelligence. It's about infrastructure, integration, and execution at scale.