A modern AI platform for the enterprise 

At the heart of this solution is a layered, cloud-native architecture that aligns rapid AI innovation with the operational rigor required in enterprise environments. Rather than treating AI as a standalone tool, the platform integrates compute, orchestration, MLOps, and foundation model services into a unified ecosystem that supports the full AI lifecycle from experimentation to large-scale production.

The foundation of the platform is built on Intel Gaudi 3-based systems, delivering the high-performance acceleration required for training and serving modern AI and foundation models. These systems provide the throughput, memory bandwidth, and scalability needed to support demanding workloads while maintaining cost-efficient operations.

 

IBM watsonx Platform Architecture

IBM watsonx.ai: Enterprise-Grade AI Studio

IBM watsonx.ai provides the tools to build, fine-tune, govern, and deploy foundation-model-based AI solutions. It enables organizations to move beyond isolated experiments and create AI applications that are trusted, explainable, and integrated into real business workflows.

With watsonx.ai, teams can:

  • Develop and orchestrate generative AI solutions
  • Leverage foundation models for reasoning and content generation
  • Manage prompts, pipelines and model lifecycles
  • Apply governance, monitoring and risk controls

Red Hat OpenShift: The Hybrid Cloud Foundation

Red Hat OpenShift delivers the enterprise Kubernetes platform required to run AI workloads consistently across on-premises, public cloud, and edge environments. It provides:

  • Secure, containerized infrastructure
  • Built-in scalability and resilience
  • DevSecOps and CI/CD integration
  • Hybrid and multi-cloud portability

This ensures AI workloads are not locked into a single environment and can evolve alongside business needs.

Red Hat OpenShift AI: End-to-End MLOps

OpenShift AI extends OpenShift with a comprehensive data science and MLOps layer, supporting:

  • Data preparation and notebooks
  • Distributed training and pipelines
  • Model serving and monitoring
  • Lifecycle automation and governance

It runs watsonx.ai workloads, enabling teams to manage the full AI lifecycle from development through deployment and continuous optimization.

Intel® Gaudi® 3: High-Performance AI Acceleration

Intel Gaudi 3 accelerators are purpose-built for large-scale AI training and inference. They deliver:

  • High throughput for foundation models
  • Strong performance-per-dollar economics
  • High-bandwidth memory and networking
  • An open, developer-friendly software ecosystem

Running this stack on Gaudi 3 systems allows enterprises to scale AI workloads efficiently while maintaining flexibility and control.

Supermicro AI Training Super Server

The platform runs on the Supermicro Super Server SYS-822GA-NGR3, an 8U rackmount AI training system designed for large-scale machine learning, deep learning, large language models (LLMs), and HPC workloads.

           

Key features

  • High-Density Accelerators: Supports up to 8 Intel Gaudi® 3 AI accelerators (OAM form factor) for massive parallel computing and model training.
  • Dual CPU Support: Dual Intel® Xeon® 6900 series processors with P-cores via LGA-7529 sockets, up to 128 cores/256 threads per CPU.
  • Memory Capacity: 24 DIMM slots supporting up to 6 TB of DDR5 ECC memory (RDIMM/LRDIMM at 6400 MT/s or MRDIMM at 8800 MT/s).
  • Networking: 6 × OSFP 800 GbE ports onboard, suited to high-bandwidth interconnects in AI clusters.
  • Storage: 8 hot-swap 2.5″ NVMe Gen5 bays at the front, plus 2 M.2 PCIe 5.0 x2 NVMe slots for boot or cache.
  • PCIe Expansion: 2 × PCIe 5.0 x16 FHFL slots, 2 × PCIe 5.0 x8 FHFL slots, and 1 × PCIe 5.0 x4 AIOM (OCP 3.0) slot.

Power, Cooling & Chassis

  • Redundant Power Supplies: 8 × 3000 W Titanium Level (96% efficiency) for stable, high-power delivery.
  • Chassis: 8U rackmount with heavy-duty cooling fans optimized for dense GPU/accelerator loads.
  • Management & Security: Super Cloud Composer, Supermicro Server Manager (SSM), hardware TPM 2.0, secure boot, firmware signing, and remote management options.
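Two figures follow directly from the specifications listed above and are useful for capacity planning: the DIMM size needed to reach the stated memory maximum, and the combined nameplate rating of the power supplies. The short sketch below only does that arithmetic. It assumes 1 TB = 1024 GB, and it does not model the redundancy scheme, which the specification list does not state, so usable power will be lower than the nameplate sum.

```python
# Figures copied from the specification list above.
DIMM_SLOTS = 24
MAX_MEMORY_TB = 6
PSU_COUNT = 8
PSU_WATTS = 3000


def max_dimm_size_gb(slots: int, total_tb: int) -> int:
    """DIMM size needed in every slot to reach the stated maximum (1 TB = 1024 GB)."""
    return total_tb * 1024 // slots


def psu_nameplate_watts(count: int, watts_each: int) -> int:
    """Sum of the PSU ratings; usable power is lower once redundancy is reserved."""
    return count * watts_each


dimm_gb = max_dimm_size_gb(DIMM_SLOTS, MAX_MEMORY_TB)
total_w = psu_nameplate_watts(PSU_COUNT, PSU_WATTS)

print(f"DIMM size for maximum memory: {dimm_gb} GB per slot")
print(f"Combined PSU nameplate rating: {total_w} W")
```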

                                                                               

Why this stack matters

Performance with Choice

Intel Gaudi 3 delivers enterprise-class acceleration without locking organizations into proprietary ecosystems.

Open by Design

Red Hat OpenShift ensures portability, interoperability, and long-term flexibility across environments.

Trusted Enterprise AI

IBM Watsonx.ai brings governance, transparency, and integration, key requirements for regulated and large-scale enterprises.

Operational Excellence

OpenShift AI provides the tooling to manage AI as a living system, not a one-time project.

Intel Gaudi 3: Accelerating RAG & Visual Summarization Workflows

Intel Gaudi 3 is Intel's third-generation AI accelerator, designed for high-performance generative AI and large-scale retrieval-augmented generation (RAG) pipelines, including multimodal tasks such as video summarization and vision-augmented retrieval. It builds on the Gaudi 2 architecture with more compute, higher memory bandwidth, and efficient scale-out for LLMs and multimodal AI workloads.

While Intel's primary focus has been text-centric RAG, video and visual summarization emerge naturally as the next step.

Integrating Visual Summarization into RAG

Modern video summarization techniques use vision-language models (VLMs) to turn video frames, audio transcripts, and visual embeddings into concise, narrative summaries of long videos. They extract key events and semantic context, which makes browsing and search far simpler.
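Before a VLM can summarize a long video, the video is usually cut into segments and a few frames are sampled from each one. The sketch below shows one simple way to plan that sampling. The `Segment` type, the `plan_segments` function, and the default segment length and frame count are illustrative assumptions, not part of the WWT and Intel architecture, and real pipelines often use scene detection instead of fixed windows.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start_s: float
    end_s: float
    frame_times_s: tuple[float, ...]  # timestamps of the frames to send to the VLM


def plan_segments(
    duration_s: float,
    segment_s: float = 30.0,
    frames_per_segment: int = 4,
) -> list[Segment]:
    """Split a video into fixed-length windows and pick evenly spaced frames in each."""
    if duration_s <= 0 or segment_s <= 0 or frames_per_segment <= 0:
        raise ValueError("duration, segment length, and frame count must be positive")

    segments: list[Segment] = []
    start = 0.0
    while start < duration_s:
        end = min(start + segment_s, duration_s)  # the last window may be shorter
        step = (end - start) / frames_per_segment
        # Sample at the midpoint of each equal slice of the window.
        times = tuple(start + step * (i + 0.5) for i in range(frames_per_segment))
        segments.append(Segment(start, end, times))
        start = end
    return segments


for seg in plan_segments(duration_s=70.0, segment_s=30.0, frames_per_segment=2):
    print(seg)
```

Each planned segment would then be captioned or embedded by the VLM, and the per-segment outputs combined into the final summary.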

Visual Summarization

Combining vision models with RAG enables "visual RAG summarization" workflows where:

  • Frames or segments are embedded into vector stores,
  • A retriever returns the relevant segments for a text or visual query,
  • LLMs articulate summaries, answers, or highlights in natural language.

This methodology is increasingly highlighted in industry demonstrations (e.g., retail surveillance summaries and interactive Q&A interfaces over video).
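The three steps above can be sketched end to end with a toy in-memory store. Everything here is a simplified stand-in: `embed` is a bag-of-words counter used in place of a real vision-language embedding model, `SegmentStore` replaces a production vector database, the segment descriptions are invented sample data, and the final prompt would be sent to an LLM rather than printed.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real pipeline would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SegmentStore:
    """Minimal vector store: keeps (id, description, vector) and ranks by cosine."""

    def __init__(self) -> None:
        self._items: list[tuple[str, str, Counter]] = []

    def add(self, segment_id: str, description: str) -> None:
        self._items.append((segment_id, description, embed(description)))

    def search(self, query: str, k: int = 2) -> list[tuple[str, str]]:
        q = embed(query)
        ranked = sorted(self._items, key=lambda it: cosine(q, it[2]), reverse=True)
        return [(sid, desc) for sid, desc, _ in ranked[:k]]


def build_prompt(question: str, hits: list[tuple[str, str]]) -> str:
    """Assemble retrieved segments into the context an LLM would answer from."""
    context = "\n".join(f"[{sid}] {desc}" for sid, desc in hits)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


store = SegmentStore()
store.add("seg-001", "a customer enters the store and picks up a basket")
store.add("seg-002", "an employee restocks the cereal shelf")
store.add("seg-003", "a customer pays at the checkout counter")

question = "customer at the checkout"
hits = store.search(question, k=2)
prompt = build_prompt(question, hits)
print(prompt)
```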

WWT and Intel created a Visual RAG architecture that uses LangChain-based orchestration and a Llama large language model. At its core is Visual Question Answering (VQA), the task of answering open-ended questions about an image. Models that support this task typically take an image and a question as input and return an answer expressed in natural language.

 

Visual RAG Architecture


Video Summarization Architecture


Why Intel Gaudi 3 for Video RAG?

Compared to traditional GPU clusters, Gaudi 3 offers:

  • Optimized specifically for generative AI workloads
  • High memory capacity for multimodal models
  • Ethernet-based scale-out
  • Competitive cost-per-token economics
  • Open software ecosystem
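Cost-per-token comparisons come down to one ratio: what a system costs per hour divided by how many tokens it produces in that hour. The sketch below shows that calculation. The function name and the input figures are hypothetical placeholders, not measured Gaudi 3 or GPU results, so substitute your own benchmark throughput and infrastructure cost.

```python
def cost_per_million_tokens(hourly_cost_usd: float, tokens_per_second: float) -> float:
    """Serving cost per one million generated tokens at a sustained throughput."""
    if hourly_cost_usd < 0 or tokens_per_second <= 0:
        raise ValueError("cost must be non-negative and throughput must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000


# Hypothetical inputs for illustration only.
example = cost_per_million_tokens(hourly_cost_usd=36.0, tokens_per_second=10_000.0)
print(f"Cost per million tokens: ${example:.2f}")
```

Running the same function with figures for two candidate systems gives a like-for-like comparison, provided both throughput numbers come from the same model, batch size, and sequence lengths.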

For enterprises running OpenShift, Gaudi integrates cleanly into Kubernetes-native workflows without proprietary lock-in.
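In Kubernetes terms, this integration means a workload asks for accelerators the same way it asks for CPU or memory, through a resource limit on the container. The sketch below builds such a pod manifest as a plain dictionary and prints it as JSON. It assumes the Intel Gaudi device plugin is installed and exposes the `habana.ai/gaudi` resource name; the pod name and image are hypothetical placeholders.

```python
import json

# Assumption: resource name advertised by the Intel Gaudi device plugin.
GAUDI_RESOURCE = "habana.ai/gaudi"


def gaudi_pod_manifest(name: str, image: str, accelerators: int) -> dict:
    """Build a minimal pod manifest that requests Gaudi accelerators."""
    if accelerators < 1:
        raise ValueError("request at least one accelerator")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "worker",
                    "image": image,
                    # Extended resources are requested through limits.
                    "resources": {"limits": {GAUDI_RESOURCE: str(accelerators)}},
                }
            ],
        },
    }


manifest = gaudi_pod_manifest(
    name="vlm-inference",  # hypothetical pod name
    image="registry.example.com/vlm-server:latest",  # hypothetical image
    accelerators=2,
)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with the cluster's usual tooling, since the Kubernetes API accepts JSON as well as YAML.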

Video and Visual Summarization represent the next frontier of enterprise AI, converting unstructured video into searchable, explainable intelligence.

By combining:

  • Intel Gaudi 3 accelerators
  • IBM watsonx.ai platform
  • Red Hat OpenShift container platform
  • OpenShift AI model lifecycle management
  • Vector databases for retrieval

enterprises can deploy an open, scalable, secure, and high-performance video intelligence platform capable of handling petabyte-scale archives.

This architecture turns video from passive storage into an active decision-support system.

Intel Gaudi 3 is purpose-built to power production-scale generative AI across industry verticals, including:

  • Consumer Goods and Retail
  • Healthcare and Medicine
  • Manufacturing
  • Media and Entertainment
  • Financial Services

The path forward

As AI becomes embedded into the core of enterprise operations, success will depend less on isolated models and more on robust platforms that can scale, adapt, and govern AI responsibly.

By deploying IBM watsonx.ai with Red Hat OpenShift and OpenShift AI on Intel Gaudi 3 systems, organizations establish a future-proof foundation for enterprise AI, one that turns innovation into impact.

The future of AI is not just about intelligence. It's about infrastructure, integration, and execution at scale.
