HPE Aspire 2022 Takeaways
Why is HPC important? For decades, HPC has been a critical part of academic research, government computing, and industry innovation. HPC helps technologists, engineers, data scientists, designers, and researchers solve significant, complex problems in far less time and at lower cost than traditional computing.
Now to the business of HPC. If you missed Aspire 2022, I'll recap my top sessions (in no particular order).
Keep your eyes on this solution (and Swarm Learning below). While HPC support for AI has snowballed over the past few years, AI support for HPC has not. Integrating machine learning (ML) and deep learning (DL) into HPC simulations remains a challenging endeavor. HPE's open-source SmartSim solution eases this burden. SmartSim is a library that facilitates the use of popular ML libraries like PyTorch and TensorFlow with HPC simulations. SmartSim allows the ML to be written in Python while cooperating with Fortran, C, and C++ simulation environments. My interests in SmartSim span many HPC domains: molecular dynamics, weather and climate, computational fluid dynamics and finite element modeling, and quantum mechanics.
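To make the pattern concrete, here is a minimal conceptual sketch of the online-inference workflow SmartSim enables: the simulation stages tensors into a shared datastore, and a Python-side ML model reads them back for inference. This is not SmartSim's actual API (the real library uses an `Experiment`, a Redis-backed `Orchestrator`, and SmartRedis clients); a plain dictionary stands in for the datastore, and all function names here are hypothetical.

```python
# Conceptual sketch of the SmartSim pattern, NOT its real API: a simulation
# writes state tensors to a shared datastore, and a Python ML surrogate
# reads them back for online inference. A dict stands in for the
# Redis-backed orchestrator SmartSim actually provides.

datastore = {}  # stand-in for SmartSim's in-memory database

def simulation_step(step: int) -> list[float]:
    """Mimics a Fortran/C/C++ simulation staging a state tensor."""
    state = [float(step + i) for i in range(4)]
    datastore[f"state_{step}"] = state  # analogous to put_tensor(...)
    return state

def ml_surrogate(state: list[float]) -> float:
    """Mimics a PyTorch/TensorFlow model scoring the staged tensor."""
    return sum(state) / len(state)  # placeholder for real inference

for step in range(3):
    simulation_step(step)

# The ML side pulls the tensors the simulation staged and runs inference.
predictions = {k: ml_surrogate(v) for k, v in datastore.items()}
print(predictions["state_0"])  # mean of [0.0, 1.0, 2.0, 3.0] -> 1.5
```

The key design idea, which SmartSim implements for real with a Redis-based orchestrator, is that the simulation and the ML code never link against each other; they only exchange tensors through the datastore.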
I am gaga about the HPC Edge, and I think HPE's Swarm Learning can help. Swarm Learning is a decentralized ML solution that leverages computing horsepower at or near the distributed data sources, secured by blockchain. In Swarm Learning, both the model's training and inferencing occur at the edge, where data is freshest and prompt data-driven decisions are required. In its decentralized architecture, only learned insights, not the raw data, are shared among collaborating ML peers, enhancing data security and privacy. Use cases of interest to me include intelligent sensor interactions for cybersecurity, medical applications, and the industrial internet of things (IIoT).
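The core idea can be sketched in a few lines: each peer trains on its own private data and shares only the learned parameters, which the swarm then merges. This is a toy illustration under my own simplifications (a one-parameter "model" and simple averaging); the real product adds blockchain-coordinated peer enrollment and leader election, and works with full ML frameworks.

```python
# Toy sketch of the Swarm Learning idea: peers train locally on private
# data and share only learned parameters (here, a single weight), never
# the raw data. The merge step is plain averaging; the real product
# coordinates peers via blockchain.

def local_train(private_data: list[float]) -> float:
    """Each edge peer fits a trivial 'model' (the mean) on its own data."""
    return sum(private_data) / len(private_data)

def swarm_merge(peer_params: list[float]) -> float:
    """Peers exchange parameters and merge them; data never leaves a node."""
    return sum(peer_params) / len(peer_params)

# Three hypothetical edge sites with private sensor readings.
site_data = [[1.0, 2.0, 3.0], [2.0, 4.0], [3.0, 3.0, 3.0]]
local_params = [local_train(d) for d in site_data]   # [2.0, 3.0, 3.0]
global_param = swarm_merge(local_params)
print(global_param)  # (2.0 + 3.0 + 3.0) / 3 ≈ 2.667
```

Note that the privacy property falls out of the structure: `swarm_merge` only ever sees `local_params`, never `site_data`.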
As manufacturers of all sizes struggle with cost and competitive pressures, the use of computer‑aided engineering (CAE) is growing. Semiconductor and processor designs are enabled by improvements in electronic design automation (EDA) workflows running on HPC infrastructure. With the increased frequency of severe weather, the importance of accurate and timely weather predictions has never been higher. HPC infrastructures range from small clusters to TOP500-class supercomputers. Combining HPC simulations with SmartSim and hybrid physics-AI models gives you a world-class combination.
HPE presented an overview of the HPC software portfolio. The software portfolio comprises HPE-developed assets and third-party software that HPC customers use to build their solutions. HPE (Cray) has a long tradition of purpose-developed software for programming environments and system management. With its new software release model, I expect improvements in the HPC customer experience thanks to integrated and validated software stack development.
What value is an HPC system if you don't have the management systems to leverage the assets? This session briefly outlined the HPC system management software: HPE Performance Cluster Manager and HPE Cray System Management. HPE Performance Cluster Manager is a comprehensive HPC cluster management tool for Apollo, ProLiant, and Cray EX systems that enables system setup, hardware monitoring and management, power management, and more. HPE Cray System Management brings cloud infrastructure to HPC systems, enabling a resilient, secure, elastic, scalable system management solution with an extensible microservices cloud stack. But we must wait for some of this goodness; these system management tools will be ready in the Gen11 timeframe.
As the amounts of data used in modern HPC solutions constantly increase, merely storing the data efficiently and cost-effectively is no longer sufficient. With advanced accelerators, more cores per processor, and faster interconnects, feeding the computational beast is becoming the foremost factor in HPC system architectures. AI compounds the situation: remote direct memory access (RDMA) protocols such as GPUDirect Storage (part of Magnum IO) bring GPUs on par with more traditional processors, where RDMA is a standard. Despite advances in object stores and more traditional file-based storage such as NFS, there is still no substitute for high-performance parallel file systems for IOPS, sustained performance, and growth capacity. Enter HPE's Parallel File System Storage (PFSS). HPC consumers also demand the reliability and resiliency attributes that are mainstream in enterprise-class storage systems, and PFSS delivers them using the industry-leading IBM Spectrum Scale Erasure Code Edition software. More fun times with high-performance storage options.
HPE Slingshot High Performance Network Overview &
HPE Slingshot - Designing, Configuring and Managing
Both sessions discussed the HPE Slingshot dragonfly topology interconnect capabilities for Cray supercomputing-class systems. Slingshot is not InfiniBand but Ethernet, with all the benefits of that standard. (Again, the war between IB and Ethernet rekindles.) Slingshot is purpose-built, using an HPE-custom NIC and switch silicon fabric design to tackle the Ethernet challenges that prevented it from being used for serious HPC and AI computing. The HPE Slingshot product technology applies to HPE Cray EX (capability) supercomputers and to volume (capacity) HPC clusters on HPE Apollo and ProLiant systems.
You cannot mention HPE without saying GreenLake at least once. Did you know that GreenLake now supports HPC infrastructure? It does, with standard architectures that support HPC workflows, managed through GreenLake Central. GreenLake for HPC now offers GPU metering, chargeback/showback, and multicloud support with full hybrid integration.
And now another plug for artificial intelligence (AI). Many sessions in the AI track included HPC content and vice versa. HPC and AI are two sides of the same coin and a cornerstone of academic research, government computing, and industry innovation. Check out the session "(EDGE02) AI at the Edge," which dove into AI solutions focused on edge use cases. These application workloads include signal intelligence, computer vision, GPU-accelerated analytics, natural language processing (NLP), and cybersecurity.
Here's an honorable session mention. "The 12 Laws of the Innovator - Building Great Leaders" was about the innovation pipeline from idea to reality. This pipeline depends on core personal and professional skills beyond technical knowledge. Robert Christiansen shared, in an entertaining fashion, how the HPE CTO Office coaches and mentors eager innovators along their leadership journey. Another area WWT and HPE share is our willingness and drive for relentless customer innovation.
Stay tuned and check out a forthcoming article about HPC + AI cross-cutting observations of particular interest from HPE Aspire 2022.
In conclusion, I hope HPE Aspire 2022 continues the positive trend of in-person technical meetings this year. Discussing possibilities with the HPC community face to face was invaluable.
Finally, come visit the HPE Discover event (June 28-30, 2022, at The Venetian in Las Vegas, NV), where Chris Black and I will present on "HPC-as-a-Service" and how high performance computing is seeing more robust adoption than ever, thanks to the rise of this offering from the major cloud providers and top HPC providers like Hewlett Packard Enterprise.