Evolving the data center amid the rise of AI

The rise of AI workloads and corresponding architectures has captivated the technology world, and for good reason. However, as AI increasingly takes center stage, it's important that we don't lose sight of traditional data center workloads.

It's easy to fall into the trap of letting our existing architectures run on autopilot while we work to enact AI mandates, yet doing so puts the successful deployment of those next-generation workloads and architectures at risk. As IT leaders, it's our job to remain focused on the core services we provide that make the "next big thing" possible.

Here, we focus on the traditional data center, but it would be impossible to discuss associated challenges and changes without acknowledging AI and the demands that those workloads will place on existing data center infrastructure. 

For example, simply accessing the data needed from current datastores, and moving it in and out of high-performance systems, will introduce demands that existing infrastructure was never predicted or sized for, placing timelines and performance at significant risk. Designing AI infrastructure is a separate topic, but designing data centers to interact with AI infrastructure is a new reality.

This blog is the first in a series intended to explore the nuances of modern data center architecture and provoke discussion on the convergence of formerly discrete technology silos. No longer can the pillars of data centers — storage, networking, compute, security and application platforms — be evaluated in isolation.

Network infrastructure

Data center network infrastructure has undergone many changes over the past decades, and not just in terms of speed and medium of transport. Network topologies have evolved from classic Core/Aggregation/Access strategies into Spine/Leaf (Clos) designs, and even into Super Spine/Multi-Pod architectures for the largest networks.  
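
To make the shift concrete, here is a minimal sketch (in Python, with purely hypothetical port counts and speeds) of the basic arithmetic behind a leaf/spine design: each leaf's downlink and uplink configuration directly determines the fabric's oversubscription ratio.

```python
# Back-of-the-envelope leaf/spine (Clos) sizing sketch.
# All port counts and speeds below are hypothetical examples, not a recommendation.

def oversubscription(downlink_ports: int, downlink_gbps: int,
                     uplink_ports: int, uplink_gbps: int) -> float:
    """Ratio of host-facing bandwidth to fabric-facing bandwidth on one leaf."""
    downlink_bw = downlink_ports * downlink_gbps
    uplink_bw = uplink_ports * uplink_gbps
    return downlink_bw / uplink_bw

# Example: 48 x 25G host ports and 6 x 100G uplinks per leaf.
ratio = oversubscription(48, 25, 6, 100)
print(f"Oversubscription: {ratio:.1f}:1")  # 2.0:1

# In a simple two-tier Clos, each leaf uplink typically lands on a different
# spine, so the uplink count also bounds the number of spines per pod.
```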

Changed, too, are the management and routing techniques for these evolved data center networks. Improvements in underlying switch hardware now allow even the smallest data center switches to handle advanced traffic routing and distribution protocols, making more efficient use of data center assets.

While classic network architectures could conceivably be managed "by hand" by experienced network architects, as networks grew, outages and misconfigurations became more commonplace. Even the largest networks and service providers are not immune to human error, as demonstrated by high-profile outages and the service disruption that followed.

Routing and network extension techniques such as EVPN and VXLAN, along with granular network segmentation approaches, can provide significant benefits and efficiencies in network topologies, but they can also be complex to configure, support and monitor.
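
As a rough illustration of where that complexity comes from, the sketch below (Python, using hypothetical tenants, VLANs and a common base-offset VLAN-to-VNI convention) estimates how many per-leaf configuration stanzas an EVPN/VXLAN fabric would require a fabric manager, or an administrator, to keep consistent; actual switch syntax varies by platform and is deliberately not shown.

```python
# Hypothetical segmentation plan: map tenant VLANs to VXLAN VNIs and estimate
# how much per-leaf configuration an EVPN/VXLAN fabric manager keeps in sync.

VNI_BASE = 10_000  # illustrative offset; real numbering plans vary widely

tenants = {             # tenant -> VLAN IDs (example values only)
    "erp":       [110, 111],
    "analytics": [120, 121, 122],
    "dmz":       [130],
}

def vlan_to_vni(vlan_id: int) -> int:
    """One common deterministic VLAN-to-VNI numbering convention."""
    return VNI_BASE + vlan_id

leaf_count = 16
mappings = {vlan: vlan_to_vni(vlan) for vlans in tenants.values() for vlan in vlans}

# Each VLAN/VNI pair needs (at minimum) a VLAN definition, a VTEP member
# statement and an EVPN instance on every leaf where it is stretched.
stanzas_per_mapping = 3
total = len(mappings) * stanzas_per_mapping * leaf_count
print(f"{len(mappings)} VNIs x {leaf_count} leaves -> ~{total} config stanzas to keep consistent")
```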

Luckily for the modern network architect, many solutions have emerged to simplify the deployment and management of these evolved topologies. Cisco's Application Centric Infrastructure (ACI), HyperFabric and Nexus Dashboard offer multiple approaches to data center fabric management, while solutions such as Arista CloudVision and Juniper Apstra offer their own philosophies on modern virtualized network configuration and management.  

Each solution has unique differentiators that may be more applicable to different types of workloads, management styles or integrations into cloud-model orchestration platforms. Gone are the days of simply selecting switches based on speed or protocol support. Management and observability now take top priority.

For example, Cisco AI Canvas, Juniper Data Center Assurance powered by Marvis, and Arista Autonomous Virtual Assist (AVA) show how the industry is rapidly evolving to include AI approaches to the management and remediation of data center networks.

As data centers see an influx of AI workloads, the data center network must adapt to accommodate interconnection with these often purpose-built networks.

Interconnection from the primary data center network to an AI pod frequently requires significant bandwidth and brings routing and traffic management considerations that are very different from traditional workloads. Link speeds exceeding 100Gb are common, as are large numbers of interconnect links to efficiently distribute the traffic generated by frequent ingress and egress of AI workload data.
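
As a back-of-the-envelope example (every figure below is a hypothetical assumption, not a measurement), the time to stage a training dataset into an AI pod falls directly out of the aggregate interconnect bandwidth, which is why both link speed and link count matter:

```python
# Rough estimate of time to move a dataset across the data center / AI pod
# interconnect. All figures are illustrative assumptions, not measurements.

def transfer_hours(dataset_tb: float, links: int, link_gbps: int,
                   efficiency: float = 0.7) -> float:
    """Hours to move dataset_tb terabytes over `links` links of `link_gbps` each.

    `efficiency` is a hypothetical factor for protocol overhead and contention.
    """
    dataset_bits = dataset_tb * 8e12          # TB -> bits (decimal units)
    usable_bps = links * link_gbps * 1e9 * efficiency
    return dataset_bits / usable_bps / 3600

# Example: a 500 TB dataset over 4 x 100G vs. 16 x 100G interconnect links.
print(f"{transfer_hours(500, 4, 100):.1f} h on 4 x 100G")    # ~4 h
print(f"{transfer_hours(500, 16, 100):.1f} h on 16 x 100G")  # ~1 h
```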

Even data centers not currently running AI workloads would be well served to build in the edge/leaf capabilities to accommodate such interconnection should the need arise, without having to re-architect or significantly alter the topology.

Historical trends in network designs often favored standardization and elimination of variability in switch models, interconnect strategies and over/underlay technologies. The disparity in requirements of various workloads now requires that multiple topologies, switch architectures and features, and transport approaches coexist and interact.

This complexity necessitates purposeful design and modern management platforms, along with a trusted partner able to navigate the constantly changing capabilities and limitations of the various platforms available.

Compute

Compute is also undergoing an evolution to accommodate the needs of modern workloads. While AI gets most of the press related to accelerators such as GPUs, other workloads are also beginning to adapt to the availability of accelerator technologies.

Network virtualization, security applications and graphical rendering are all early adopters of offload strategies, requiring compute form factors to be redesigned to accommodate accelerators, both in physical footprint and in increased power and cooling demands. Advances in processor capabilities further complicate form factor design as power and cooling demands extend beyond traditional design considerations.

It is becoming clear that air cooling alone will not support the modern data center within the next several generations of processors and accelerator technologies. The most performant solutions on the market require some measure of alternative cooling, with mainstream products already pushing the limits of air cooling.

While less performant solutions can work within existing heat and power envelopes, physical density and efficiency must also be considered. For new data center construction, a liquid cooling strategy should be a primary consideration.

Retrofitting an entire data center to accommodate liquid cooling is costly and disruptive, so many environments choose to implement a hybrid strategy with in-rack/row solutions coupled with air cooling to extend the life of existing data centers while minimizing disruption.
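
To put numbers on the constraint, here is a short sketch (Python, with hypothetical server wattages and rack limits) of the rack-level arithmetic: the IT load in a rack is the heat the cooling system, whether air or liquid, must remove, at roughly 3,412 BTU/hr per kilowatt.

```python
# Rack power / heat budget sketch. Server wattages and rack limits below are
# hypothetical placeholders; substitute vendor nameplate or measured figures.

BTU_PER_KW_HR = 3412  # 1 kW of IT load dissipates ~3,412 BTU per hour

def rack_heat_load(servers: int, watts_per_server: float) -> tuple[float, float]:
    """Return (kW of IT load, BTU/hr of heat) for one rack."""
    kw = servers * watts_per_server / 1000
    return kw, kw * BTU_PER_KW_HR

# Example: a rack of 16 dense GPU-accelerated servers at ~2.4 kW each,
# compared against an assumed air-cooled budget of about 20 kW per rack.
kw, btu = rack_heat_load(16, 2400)
print(f"IT load: {kw:.1f} kW (~{btu:,.0f} BTU/hr)")
if kw > 20:  # hypothetical air-cooling ceiling for this facility
    print("Exceeds the assumed air-cooling budget -> rear-door or direct liquid cooling")
```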

As workloads evolve to include real-time, actionable data processing, compute resources are being deployed at the network edge, close to where data is collected, so that decisions can be made in real time. For example, use cases such as retail computer vision, industrial automation and battlefield intelligence analysis demand that data be processed locally.

Edge deployments are often in locations with limited local technical expertise, unregulated environmental conditions and physical security risks. Component failure rates are often higher due to these environmental factors, and coupled with the lack of local service capabilities, the opportunity for disruption remains high.

As with data center networking, homogeneity can no longer be assumed for compute platforms. The combination of form factors, accelerator usage, cooling technologies and deployment locations means that manual management and monitoring is simply not practical or sustainable.

Instead, a comprehensive platform is needed to provide management and visibility across this varied landscape, including predictive analytics to improve response to impending failures and recovery from disruptions.
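
The analytics need not be exotic to add value. As a simplified, hypothetical illustration (not any particular vendor's method), even a rolling z-score over fleet telemetry can flag a component drifting away from its baseline before it disrupts a workload:

```python
# Minimal predictive-telemetry sketch: flag readings that drift well outside a
# device's recent baseline. Values are synthetic; real platforms use far richer
# models, but the principle (baseline + deviation) is the same.
from statistics import mean, stdev

def anomalies(readings: list[float], window: int = 10, threshold: float = 3.0) -> list[int]:
    """Indexes of readings more than `threshold` standard deviations from the
    mean of the preceding `window` samples."""
    flagged = []
    for i in range(window, len(readings)):
        base = readings[i - window:i]
        mu, sigma = mean(base), stdev(base)
        if sigma and abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# Example: a fan-speed (RPM) series that suddenly jumps as a bearing degrades.
fan_rpm = [8000, 8020, 7990, 8010, 8005, 7995, 8015, 8000, 8010, 7990,
           8005, 8000, 9600, 9800]
print(anomalies(fan_rpm))  # -> indexes of the out-of-baseline readings
```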

Various cloud-delivered and on-premises solutions address these considerations, such as Cisco Intersight, HPE OneView and Dell OpenManage Enterprise.  Selection of the most fitting platform will depend on many factors, including form factor requirements, deployment locations and methods, and integration with virtualization or workload management solutions.

Enterprise storage

Enterprise storage remains one of the foundational pillars of the modern data center, yet it is undergoing a transformation as profound as that of compute and networking.

Historically, storage systems were designed around predictable workloads and relatively static capacity planning. Today, the rise of AI, analytics and real-time applications has shattered those assumptions. Modern workloads demand not only massive scalability but also unprecedented levels of performance, resiliency and flexibility.

Traditional storage architectures — built on monolithic arrays and tiered disk strategies — are giving way to software-defined and disaggregated solutions that integrate tightly with virtualization and container platforms. These approaches allow organizations to pool resources, automate provisioning and dynamically allocate performance tiers based on workload requirements. 

NVMe over Fabrics (NVMe-oF) is further accelerating this shift, reducing latency to levels once reserved for in-memory computing. Meanwhile, the adoption of denser media like QLC-based Flash enables higher storage capacities and more cost-effective scaling, empowering modern data centers to meet the explosive growth of data while maintaining efficiency.
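
For a rough sense of why denser media changes the planning math, the sketch below (Python, with purely hypothetical prices, overheads and data-reduction ratios) compares effective cost per usable terabyte once protection overhead and data reduction are factored in:

```python
# Effective usable capacity / cost sketch. Every figure below is a hypothetical
# placeholder; substitute real quotes and measured reduction rates.

def effective_cost_per_tb(raw_tb: float, price_per_raw_tb: float,
                          protection_overhead: float, data_reduction: float) -> float:
    """Cost per *effective* TB after parity/replication overhead and data reduction.

    protection_overhead: fraction of raw capacity consumed by RAID/erasure coding.
    data_reduction: assumed dedupe/compression ratio (e.g., 3.0 for 3:1).
    """
    usable_tb = raw_tb * (1 - protection_overhead)
    effective_tb = usable_tb * data_reduction
    return (raw_tb * price_per_raw_tb) / effective_tb

# Hypothetical comparison: denser QLC media at a lower price per raw TB vs. TLC.
tlc = effective_cost_per_tb(raw_tb=500, price_per_raw_tb=100, protection_overhead=0.2, data_reduction=3.0)
qlc = effective_cost_per_tb(raw_tb=500, price_per_raw_tb=70,  protection_overhead=0.2, data_reduction=3.0)
print(f"TLC: ${tlc:.2f}/effective TB   QLC: ${qlc:.2f}/effective TB")
```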

The relationship between enterprise storage and the modern data center is no longer transactional; it is symbiotic. 

Storage platforms must seamlessly integrate with orchestration tools, security frameworks and hybrid cloud strategies. They must also anticipate the unique demands of AI and high-performance computing, where data locality and throughput can dictate the success of an entire initiative. Designing for these realities means considering not just capacity, but also data mobility, replication strategies and the ability to scale horizontally without disruption.
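
As one concrete example of designing for data mobility (all figures below are hypothetical), the data change rate a replication link must absorb bounds the recovery point objective (RPO) that can realistically be promised:

```python
# Replication sizing sketch: can a given link keep up with the data change
# rate at the desired RPO? All rates and sizes are illustrative assumptions.

def replication_lag_ok(change_gb_per_hr: float, link_gbps: float,
                       rpo_minutes: float, efficiency: float = 0.6) -> bool:
    """True if the link can ship one RPO interval's worth of changes within that interval."""
    changed_gbits = change_gb_per_hr * (rpo_minutes / 60) * 8
    usable_gbits = link_gbps * efficiency * rpo_minutes * 60
    return changed_gbits <= usable_gbits

# Example: ~2 TB/hr of change, 15-minute RPO, over a 1 Gbps vs. 10 Gbps link.
print(replication_lag_ok(2000, 1, 15))   # False -> link cannot sustain the RPO
print(replication_lag_ok(2000, 10, 15))  # True
```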

As with networking and compute, management and observability are paramount. Modern storage platforms increasingly incorporate AI-driven analytics for predictive failure detection, anomalous behavior detection and performance optimization. Platforms such as Dell PowerStore, NetApp ONTAP and Pure Storage exemplify this trend, offering APIs and integrations that enable storage to function as part of a cohesive, automated data center fabric rather than a standalone silo.

Virtualization and container platforms 

Even as organizations continue to expand their on-premises environments, they face a pivotal choice in how those environments evolve. 

A number of organizations are prioritizing cloud-like capabilities that allow their data centers to compete with the agility and service models of Hyperscale providers ("Private Cloud" with a capital P & C). Others are focused on cost-optimized solutions designed to deliver virtualized infrastructure at the lowest possible expense ("private cloud"). Both directions are viable, but they represent very different philosophies about the role of IT in the business.

Most organizations fall somewhere in between these extremes. Rather than a binary decision, the landscape is more of a gradient, with varying levels of automation, orchestration and optimization layered onto existing virtualization platforms. 

Some environments evolve incrementally, adopting orchestration in targeted areas, while others pursue a more ambitious transformation into service-driven platforms. The right path depends on whether agility, cost efficiency or a blend of both is the greater priority. 

At WWT, we work closely with our primary partners — VMware by Broadcom, Nutanix, Microsoft and Red Hat — to help clients identify where they want to be on this spectrum and adopt the solution stack that aligns with their long-term strategy.

Conclusion

It has become clear that the modern data center must be designed with management, observability and analytics first, with underlying hardware selections made to support those priorities.

Gone are the days of simply selecting the fastest, most feature-rich or favored partner's product and deciding how to manage it later. This approach simply does not scale or adapt to the demands of the present or future. Today, designs must be management- and observability-led.

Consider engaging the experts at World Wide Technology to help identify your data center goals and begin the transformation into a more efficient and resilient modern data center.  

We offer Modern Data Center briefings and workshops to discuss overall goals and trends, as well as deep technical evaluations of specific products and solutions to accelerate decision-making, design and implementation to meet your organization's evolving technology needs.