Meet the Ever-Increasing Demands for Workflow Performance with Intel's 4th Gen Xeon Processors
As a longstanding Intel partner, WWT is excited for Intel to unveil its 4th Gen Intel Xeon Scalable processor, designed to accelerate workflow performance while also increasing power utilization efficiencies. The combination of improved core resource engineering and accelerators will supply customers with unique built-in acceleration capabilities that improve data center performance across today's fastest-growing and most demanding workloads, including analytics, artificial intelligence (AI), networking, storage and high-performance computing (HPC).
We have been closely involved with each generational release of Intel Xeon Scalable processors. For this latest version, we have been evaluating the exciting advancements it offers in our Advanced Technology Center (ATC), envisioning and designing how it represents a significant leap forward in our ability to deliver enterprise-ready solutions for our customers.
Here are a few highlights of the new 4th Gen Intel Xeon Scalable processor that we believe make it stand out not only as a critical advancement for today but also as a foundational model for data center modernization.
We believe a key feature of the 4th Gen Intel Xeon Scalable processor is its ability to increase efficiency and utilization while reducing compute latency and lowering total cost of ownership (TCO) – enabled in part by Compute Express Link (CXL) 1.1 for next-generation workloads. Designed to address the needs of high-performance heterogeneous systems, CXL 1.1 maximizes hardware resources while boosting speed and performance in virtually every aspect of the data center, complementing the benefits of a composable disaggregated infrastructure (CDI).
As we shared earlier this year, CXL 1.1 is an open industry standard interconnect that provides high-speed CPU-to-device and CPU-to-memory connectivity. CXL 1.1 leverages a second critical technology – 80 lanes of PCIe 5.0 physical layer infrastructure (increased from 64 lanes on the prior generation) – to create a common memory space across the host and all available resources, including those residing separately from the chassis. Before CXL, those resources would have remained isolated and unavailable. The Intel Agilex FPGA portfolio is among the growing list of new devices that support CXL 1.1 and PCIe 5.0 for high-bandwidth connectivity and acceleration across a range of applications.
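To give a rough sense of what the lane increase means, the following Python sketch estimates theoretical one-direction PCIe bandwidth. It is a back-of-the-envelope calculation, not a benchmark. The per-lane signaling rates (16 GT/s for PCIe 4.0, 32 GT/s for PCIe 5.0), the 128b/130b encoding, and the assumption that the prior generation's 64 lanes were PCIe 4.0 are not stated in this article; the function name is our own.

```python
# Rough theoretical PCIe bandwidth per direction.
# Assumptions (not from the article): 16 GT/s per lane for PCIe 4.0,
# 32 GT/s per lane for PCIe 5.0, 128b/130b line encoding, and that the
# prior generation's 64 lanes were PCIe 4.0.

ENCODING_EFFICIENCY = 128 / 130  # payload bits per transmitted bit


def pcie_bandwidth_gb_per_s(lanes: int, gigatransfers_per_sec: float) -> float:
    """Return theoretical one-direction bandwidth in gigabytes per second."""
    usable_gigabits = lanes * gigatransfers_per_sec * ENCODING_EFFICIENCY
    return usable_gigabits / 8  # bits -> bytes


prior_gen = pcie_bandwidth_gb_per_s(lanes=64, gigatransfers_per_sec=16.0)
fourth_gen = pcie_bandwidth_gb_per_s(lanes=80, gigatransfers_per_sec=32.0)

print(f"Prior generation (64 lanes, PCIe 4.0): {prior_gen:.1f} GB/s")
print(f"4th Gen (80 lanes, PCIe 5.0): {fourth_gen:.1f} GB/s")
print(f"Ratio: {fourth_gen / prior_gen:.2f}x")
```

Under these assumptions the combined effect of more lanes and a faster link is about 2.5 times the theoretical I/O bandwidth. Real-world throughput depends on protocol overhead and the attached devices.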
In short, the goal is to derive maximum value from all available resources for increased data center performance while supporting emerging CDI solutions. CXL helps accomplish this, fundamentally altering tomorrow's data center server architecture. In doing so, CXL represents the first step in what will become a foundational technology shift in data center design.
New accelerator features, Intel Advanced Matrix Extensions (Intel AMX) and Intel Data Streaming Accelerator (Intel DSA), support growing AI and HPC workloads.
Intel is working continuously to innovate and advance existing accelerator technologies, such as its unique Intel Advanced Matrix Extensions (Intel AMX), a built-in accelerator for advanced performance of deep learning training and inference on 4th Gen Intel Xeon Scalable processors. Ideal for AI, including machine learning and deep learning, Intel AMX can achieve up to 10x higher PyTorch real-time inference performance (BF16) and up to 10x higher PyTorch training performance (BF16) vs. the prior generation (FP32), according to Intel. It also accelerates demanding workloads such as natural language processing, recommendation systems and image recognition.
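The BF16 vs. FP32 comparison above refers to number formats. As a minimal illustration of what BF16 trades away, the sketch below rounds a value to bfloat16 using only the Python standard library. It does not use Intel AMX or PyTorch; it only shows the format, which keeps FP32's sign bit and 8 exponent bits but stores 7 mantissa bits instead of 23. The helper name `to_bfloat16` is our own, and NaN handling is deliberately omitted.

```python
import struct


def to_bfloat16(value: float) -> float:
    """Round a float to the nearest bfloat16 value and return it as a float.

    bfloat16 is the upper 16 bits of an IEEE 754 float32: 1 sign bit,
    8 exponent bits and 7 explicit mantissa bits. NaN is not handled.
    """
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    # Round to nearest, ties to even, before discarding the low 16 bits.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + rounding_bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


for sample in (1.0, 3.14159265, 0.1, 1000.123):
    rounded = to_bfloat16(sample)
    relative_error = abs(rounded - sample) / abs(sample)
    print(f"{sample!r:>12} -> {rounded!r:<12} relative error {relative_error:.2e}")
```

Each value keeps roughly two to three significant decimal digits while retaining FP32's full exponent range, which is why BF16 is widely used for deep learning workloads that tolerate reduced precision.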
Another new on-chip feature, Intel Data Streaming Accelerator (Intel DSA), improves streaming data movement and transformation operations, accelerating high-performance storage, networking and data-intensive workloads. Ideal for 5G and networking, Intel DSA offloads common data movement tasks that contribute to data center overhead, accelerating traffic across processors, memory, storage and other devices. Intel DSA eliminates the need for deploying multiple separate accelerators for these tasks – now it's on the chip itself.
Intel reports the following performance improvements over software-only implementations:
- Up to 95 percent higher vSwitch throughput for packet sizes above ~800B for 200Gbps bi-directional switching with built-in Intel Data Streaming Accelerator (Intel DSA) compared to an existing software-only implementation.
- Up to 96 percent lower latency at the same throughput (RPS) with Intel Dynamic Load Balancer (Intel DLB) vs. software for Istio ingress gateway working on 6 cores/12 threads.
It's important to note that Intel AMX and Intel DSA acceleration solutions are unique to the 4th Gen Intel Xeon Scalable processor – key differentiators for Intel that will prove their value in data centers of today and tomorrow.
To answer the demand for high-performance computing, the Intel Xeon CPU Max Series is the first and only x86-based processor with high bandwidth memory (HBM), accelerating many HPC workloads without the need for code changes. Putting HBM directly on the chip instead of requiring a separate accelerator card advances HPC capabilities, moving multi-vector modeling applications from run times of weeks to near real time. This capability delivers real-world benefits for important applications such as climate research, weather simulation and even pandemic tracking – use cases in which immediate insights can be literally a matter of life and death.
To enhance application performance on high-value workloads even further, this latest generation of processors adds support for DDR5 memory, moving from DDR4 at 3,200 MT/s on the prior generation to DDR5 at 4,800 MT/s on the new Intel Xeon processors – a faster base speed and bigger memory footprint that can accommodate higher-capacity DIMMs while consuming less power to achieve the same performance specs as previous generations.
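As a rough illustration of what the step from 3,200 to 4,800 means for bandwidth, the sketch below computes theoretical peak memory bandwidth per socket. It treats both figures as transfer rates in megatransfers per second and assumes eight memory channels per socket with a 64-bit (8-byte) data path per channel; those assumptions, and the function name, are ours rather than the article's.

```python
# Theoretical peak memory bandwidth per socket.
# Assumptions (not from the article): 8 memory channels per socket,
# 8 bytes (64 bits) transferred per channel per transfer, and the
# 3,200 / 4,800 figures interpreted as megatransfers per second.

BYTES_PER_TRANSFER = 8
CHANNELS_PER_SOCKET = 8


def peak_memory_bandwidth_gb_per_s(
    megatransfers_per_sec: int, channels: int = CHANNELS_PER_SOCKET
) -> float:
    """Return theoretical peak bandwidth in gigabytes per second."""
    bytes_per_sec = megatransfers_per_sec * 1_000_000 * BYTES_PER_TRANSFER * channels
    return bytes_per_sec / 1e9


prior_gen = peak_memory_bandwidth_gb_per_s(3200)
fourth_gen = peak_memory_bandwidth_gb_per_s(4800)

print(f"Prior generation at 3,200 MT/s: {prior_gen:.1f} GB/s")
print(f"4th Gen at 4,800 MT/s: {fourth_gen:.1f} GB/s")
print(f"Ratio: {fourth_gen / prior_gen:.2f}x")
```

With the same channel count, the higher transfer rate alone yields 1.5 times the theoretical peak bandwidth. Sustained bandwidth in practice is lower and depends on DIMM population and access patterns.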
Other 4th Gen Intel Xeon Scalable processor features can enable customers to:
- Run cloud and networking workloads with faster cryptography using fewer cores for more efficient utilization and sustainability.
- Utilize up to 47 percent fewer cores to achieve the same connections/second using integrated Intel QAT vs. the prior generation on NGINX TLS Webserver handshakes.
- Automatically protect cache flushes on shutdown or crash, treating the CPU cache as persistent memory to preserve data with minimal performance impact.
- Improve security and protect your investment in heavy workload performance. Intel Software Guard Extensions (Intel SGX) and other security features help establish a zero-trust strategy that unlocks opportunities for business collaboration even while working with sensitive data assets. Built-in security accelerators for encryption help free up CPU cores while improving performance.
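To make the "up to 47 percent fewer cores" figure in the list above concrete, here is a small arithmetic sketch. The 32-core baseline is a hypothetical example, not a measured configuration, and the 47 percent value is Intel's best-case claim, so actual results may be lower.

```python
import math


def cores_needed(baseline_cores: int, percent_fewer: int) -> int:
    """Cores required after a reduction of `percent_fewer` percent, rounded up."""
    return math.ceil(baseline_cores * (100 - percent_fewer) / 100)


BASELINE_CORES = 32       # hypothetical prior-generation core count for the workload
BEST_CASE_REDUCTION = 47  # "up to 47 percent fewer cores" (best case)

required = cores_needed(BASELINE_CORES, BEST_CASE_REDUCTION)
freed = BASELINE_CORES - required

print(f"Cores needed for the same connections/second: {required}")
print(f"Cores freed for other work: {freed}")
```

In this hypothetical case, a workload that occupied 32 cores would need 17, leaving 15 cores available for other work.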
Rethink your data center infrastructure with the significant performance gains from a technology refresh.
From an industry perspective, the unique features of the 4th Gen Intel Xeon Scalable processor – notably Intel AMX and Intel DSA, CXL 1.1 and PCIe 5.0, HBM2e and vcache advancements, as well as DDR5 memory support – are especially important given their night-and-day performance gains over the first generation of Intel Xeon Scalable processors, released only a few years ago.
When you're running an expensive software solution that's licensed based on processor cores, it's essential to be running on the best technology to achieve the greatest return on your investment. With this next-gen processor you can truly do more with less: freeing up processor resources with built-in acceleration for new levels of utilization, efficiency and sustainability.
Given the pace of IT advancement – and especially foundational innovations such as this one – now is the time to reconsider your infrastructure refresh cycle. Previously it might have made sense for prudent CFOs to look at a five- to seven-year refresh cycle. But the performance gains and cost savings to be realized with the 4th Gen Intel Xeon Scalable processor call for a serious rethinking of that timetable.
These are just a few of the advantages that the 4th Gen Intel Xeon Scalable processor will bring to the modern data center in early 2023 and beyond. In addition to evaluating the new processor in our ATC, we're coordinating its rollout with major OEM vendors, including Dell, HPE and Cisco, as well as its availability through public cloud vendors, and we will be introducing enterprise-level solutions in the next few months.
Contact WWT for a conversation about incorporating 4th Gen Intel Xeon Scalable processors into your data center modernization and to schedule a future demonstration in the ATC.