Aurora (Intel-HPE)

Aurora, Argonne’s first exascale computer, features Intel’s Data Center GPU Max Series GPUs and Intel Xeon Max Series CPUs with HBM, a PCIe CPU-GPU interconnect, the HPE Slingshot fabric as the system interconnect, and a Cray EX platform. Aurora includes several technological innovations, among them a revolutionary I/O system, the Distributed Asynchronous Object Store (DAOS), which supports new types of machine learning and data science workloads alongside traditional modeling and simulation workloads.

The Aurora software stack includes the HPE HPCM software stack, the Intel oneAPI software development kit, and data and learning frameworks. Supported programming models include MPI, Intel oneAPI, OpenMP, SYCL/DPC++, Kokkos, RAJA, and others. HIP is supported via chipStar (formerly CHIP-SPV).

Aurora System Configuration
Architecture  Intel/HPE
Node  2x 4th Gen Intel Xeon Max Series CPUs with HBM; 6x Intel Data Center GPU Max Series
Node Count  10,624
GPU Architecture  Intel Data Center GPU Max Series; tile-based, chiplets, HBM stack, Foveros 3D integration
Interconnect  HPE Slingshot 11; Dragonfly topology with adaptive routing; 25.6 Tb/s per switch, from 64 ports at 200 Gb/s each (25 GB/s per direction)
File System  230 PB, 31 TB/s (DAOS)
Peak Performance  ≥ 2 EF DP peak
Aggregate System Memory  20 PB
Aggregate DDR5 Memory  10.6 PB
Aggregate CPU HBM  1.3 PB
Aggregate GPU HBM  8.1 PB
Node Memory Architecture  Unified memory architecture, RAMBO
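
The aggregate figures in the configuration table can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes decimal capacity units (1 PB = 1,000,000 GB), that a Slingshot switch has 64 ports at 200 Gb/s each, and that the per-switch figure counts both directions of every port. The per-node values it prints are averages implied by the rounded aggregates, not official per-node specifications.

```python
# Figures copied from the Aurora configuration table.
NODE_COUNT = 10_624
CPUS_PER_NODE = 2
GPUS_PER_NODE = 6
DDR5_PB = 10.6
CPU_HBM_PB = 1.3
GPU_HBM_PB = 8.1

# Assumption: decimal capacity units, 1 PB = 1,000,000 GB.
GB_PER_PB = 1_000_000


def per_node_gb(aggregate_pb: float, nodes: int = NODE_COUNT) -> float:
    """Average capacity per node (GB) implied by an aggregate figure (PB)."""
    return aggregate_pb * GB_PER_PB / nodes


total_cpus = NODE_COUNT * CPUS_PER_NODE
total_gpus = NODE_COUNT * GPUS_PER_NODE
# The three memory tiers should add up to the quoted aggregate system memory.
total_memory_pb = round(DDR5_PB + CPU_HBM_PB + GPU_HBM_PB, 1)

# Assumption: a switch has 64 ports at 200 Gb/s each, and the per-switch
# figure counts both directions of every port.
PORTS_PER_SWITCH = 64
PORT_GBIT_PER_S = 200
port_gbyte_per_s = PORT_GBIT_PER_S / 8  # bits to bytes, one direction
switch_tbit_per_s = PORTS_PER_SWITCH * PORT_GBIT_PER_S * 2 / 1000

print(f"CPUs: {total_cpus:,}  GPUs: {total_gpus:,}")
print(f"DDR5 + CPU HBM + GPU HBM = {total_memory_pb} PB")
for label, pb in [("DDR5", DDR5_PB), ("CPU HBM", CPU_HBM_PB), ("GPU HBM", GPU_HBM_PB)]:
    print(f"{label}: about {per_node_gb(pb):.0f} GB per node")
print(f"Port: {port_gbyte_per_s} GB/s per direction; switch: {switch_tbit_per_s} Tb/s")
```

Under these assumptions the three memory tiers sum to the quoted 20 PB, a 200 Gb/s port corresponds to 25 GB/s per direction, and 64 such ports give 25.6 Tb/s (terabits, not terabytes) per switch.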

The most recent information on Aurora can be found at https://docs.alcf.anl.gov/aurora/machine-overview/

Frontier (Cray-HPE)

Frontier is an HPE Cray EX supercomputer located at the Oak Ridge Leadership Computing Facility. With a theoretical peak double-precision performance of approximately 2 exaflops (2 quintillion calculations per second), it is the fastest system in the world for a wide range of traditional computational science applications. The system has 77 Olympus rack HPE cabinets, each with 128 AMD compute nodes, for a total of 9,856 AMD compute nodes.

Each Frontier compute node consists of a single 64-core AMD 3rd Gen EPYC CPU and 512 GB of DDR4 memory. Each node also contains four AMD MI250X GPUs, each with 2 Graphics Compute Dies (GCDs), for a total of 8 GCDs per node. Each GCD has 64 GB of high-bandwidth memory (HBM2E), for a total of 512 GB of HBM per node.
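
The node and memory counts quoted above follow from simple multiplication, which the following sketch reproduces. The system-wide totals at the end (total GCDs, aggregate HBM and DDR4) are derived here for illustration and are not quoted in the text; they assume decimal capacity units (1 PB = 1,000,000 GB).

```python
# Figures copied from the Frontier description.
CABINETS = 77
NODES_PER_CABINET = 128
GPUS_PER_NODE = 4
GCDS_PER_GPU = 2
HBM_GB_PER_GCD = 64
DDR4_GB_PER_NODE = 512

node_count = CABINETS * NODES_PER_CABINET
gcds_per_node = GPUS_PER_NODE * GCDS_PER_GPU
hbm_gb_per_node = gcds_per_node * HBM_GB_PER_GCD

# Derived system-wide totals (not quoted in the text).
# Assumption: decimal capacity units, 1 PB = 1,000,000 GB.
total_gcds = node_count * gcds_per_node
aggregate_hbm_pb = node_count * hbm_gb_per_node / 1_000_000
aggregate_ddr4_pb = node_count * DDR4_GB_PER_NODE / 1_000_000

print(f"Nodes: {node_count:,}")
print(f"GCDs per node: {gcds_per_node}, HBM per node: {hbm_gb_per_node} GB")
print(f"Total GCDs: {total_gcds:,}")
print(f"Aggregate HBM: {aggregate_hbm_pb:.2f} PB, aggregate DDR4: {aggregate_ddr4_pb:.2f} PB")
```

Because each GCD is typically exposed to applications as a separate GPU device, the GCD count rather than the GPU package count is usually the relevant number when sizing a job.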

Frontier System Configuration
Architecture  HPE / AMD
Node  1x 3rd Gen AMD EPYC CPU; 4x AMD Instinct MI250X GPUs
Node Count  9,856
GPU Link  AMD Infinity Fabric
Interconnect  4x HPE Slingshot NICs providing 100 GB/s network bandwidth
File System  700 PB Lustre center-wide file system + 11 PB flash
Peak Performance  ≥ 2 EF DP peak
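
As a rough consistency check on the interconnect row, the sketch below shows how four NICs add up to 100 GB/s per node. It assumes each Slingshot NIC runs at 200 Gb/s per direction, a per-NIC rate the table does not state. The per-GCD and system-wide values are derived for illustration and describe line rate, not measured performance.

```python
# Figures copied from the Frontier configuration table and node description.
NODE_COUNT = 9_856
NICS_PER_NODE = 4
GCDS_PER_NODE = 8  # 4 MI250X GPUs with 2 GCDs each

# Assumption: each Slingshot NIC runs at 200 Gb/s per direction.
NIC_GBIT_PER_S = 200

nic_gbyte_per_s = NIC_GBIT_PER_S / 8  # bits to bytes
node_gbyte_per_s = NICS_PER_NODE * nic_gbyte_per_s
gbyte_per_s_per_gcd = node_gbyte_per_s / GCDS_PER_NODE

# Derived: total injection bandwidth if every node drove all NICs at line rate.
system_tbyte_per_s = NODE_COUNT * node_gbyte_per_s / 1000

print(f"Per NIC: {nic_gbyte_per_s} GB/s, per node: {node_gbyte_per_s} GB/s")
print(f"Per GCD share: {gbyte_per_s_per_gcd} GB/s")
print(f"System injection bandwidth at line rate: {system_tbyte_per_s} TB/s")
```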

Further details may be found at https://docs.olcf.ornl.gov/systems/frontier_user_guide.html