Summit

At the OLCF, the 200-petaflop system called Summit is available to INCITE users in 2021. The Summit system has very powerful, large-memory nodes that are most effective for applications that utilize the GPUs efficiently. The machine also has node-local, non-volatile memory (NV) that can be used to increase I/O bandwidth or provide expanded local storage for applications. The peak performance is 42 teraflops (TF) per node, with 512 gigabytes (GiB) of DDR4 memory, 96 GiB of HBM2 memory, and 1,400 GiB of available non-volatile memory.

Summit System Configuration
Architecture: IBM
Node: 2 IBM Power9 processors + 6 NVIDIA Volta GPUs
Compute Nodes: 4,608 hybrid nodes
Node performance: 42 TF
Memory/node: 512 GiB DDR4 + 96 GiB High Bandwidth Memory (HBM2)
Available NV memory per node: 1,400 GiB
GPU link: NVLink 2.0
Total system memory: >10 PB (DDR4 + HBM2 + Non-volatile)
Interconnect: Non-blocking Fat Tree
Interconnect Bandwidth (node injection): Dual-port EDR InfiniBand (25 GB/s)
File system: 250 PB, 2.5 Terabytes/s, GPFS
Peak Speed: 200 PF
Top500 Profile: Summit

Summit uses IBM’s Spectrum Scale™ file system, which provides 250 petabytes (PB) of capacity and bandwidth of up to 2.5 terabytes/second (TB/s). In addition, each node has 1.4 TiB of available non-volatile memory that provides high-speed local storage and serves as a burst buffer in front of the file system. All OLCF users also have access to the HPSS data archive, the Rhea pre- and post-processing cluster, and the EVEREST high-resolution visualization facility. All of these resources are available through high-performance networks, including ESnet’s recently upgraded 100 Gb/s links.

For more information about any of the OLCF resources, please visit https://www.olcf.ornl.gov/olcf-resources/

Summit Website

Frontier (Available in 2023)

Three-year proposals in 2021 may request access to the Frontier system in their final year. The Frontier system will be based on Cray’s new Shasta architecture and Slingshot interconnect, with high-performance AMD EPYC CPU and Radeon Instinct GPU technology. The new accelerator-centric compute blades will support a 4:1 GPU-to-CPU ratio, with high-speed links and coherent memory between CPU and GPUs within the node. The peak performance is expected to be over 1.5 exaflops (EF).

For the 2021 INCITE proposal submission, Frontier allocations in 2023 will be requested in equivalent “Summit node-hours.” For planning purposes for this cycle, we conservatively expect nearly 133 million Summit-equivalent node-hours to be allocated on Frontier per year and that the average INCITE project will be awarded approximately 3-4 million Summit-equivalent node-hours in 2023. Awarded three-year projects will be required to revisit their Summit-equivalent node-hour request for Frontier, and their computational readiness for Frontier will be re-evaluated in future renewals.

Further details may be found at https://www.olcf.ornl.gov/frontier/

Theta

Theta, the ALCF’s Cray XC40 supercomputer, is equipped with 281,088 cores, 70 TB of high-bandwidth MCDRAM, 843 TB of DDR4 memory, and 562 TB of SSD storage, and has a peak performance of 11.69 PF.

Each of Theta’s 4,392 compute nodes has an Intel “Knights Landing” Xeon Phi processor with 64 cores (four hardware threads per core), 16 GiB of high-bandwidth MCDRAM, 192 GiB of DDR4 memory, and a 128-GiB SSD. Theta uses Cray’s high-performance Aries network in a Dragonfly configuration. The Xeon Phi supports improved vectorization via AVX-512 SIMD instructions.

Theta System Configuration
Architecture: Cray XC40
Processor: Intel “Knights Landing” Xeon Phi
Nodes: 4,392 compute nodes
Cores/node: 64
Total cores: 281,088 Cores
HW threads/core: 4
HW threads/node: 256
Memory/node: 16 GiB MCDRAM + 192 GiB DDR4 + 128 GiB SSD
Memory/core: 256 MiB MCDRAM + 3 GiB DDR4
Interconnect: Aries (Dragonfly)
Speed: 11.69 PF
Top500 Profile: Theta

Users of Theta will have access to a 10-PB Lustre file system with 210 GB/s bandwidth, in addition to 27 PB of GPFS file storage, which is connected to Cooley, the ALCF’s data analysis and visualization cluster, at an aggregate speed of 330 GB/s. Additionally, the ALCF expects a new global file system to come online in 2020, with a target capacity of 150 PB and bandwidth of 1 TB/s.

For more information about any of the ALCF resources, please visit http://www.alcf.anl.gov/computing-resources.

Theta Website

Aurora (Available in 2023)

The replacement for Theta is Aurora, a system with sustained performance of >= 1 EF and over 10 PB of aggregate memory. Each compute node will have 2 Intel Xeon Scalable processors (“Sapphire Rapids”) and 6 Xe architecture-based GPGPUs (“Ponte Vecchio”), with a unified memory architecture across the CPUs and GPUs and all-to-all connectivity within the node. The system interconnect is Cray’s Slingshot in a Dragonfly topology with adaptive routing and 8 fabric endpoints per node. Aurora’s high-performance storage system will be Distributed Asynchronous Object Storage (DAOS), providing >= 230 PB at an access rate of >= 25 TB/s. Aurora will also connect to the ALCF global file systems, which provide over 200 PB of storage at an access rate of ~1.3 TB/s.

The expected software stack includes the Cray Shasta software stack, Intel software, and data and learning frameworks. Supported programming models include MPI, Intel oneAPI, OpenMP, SYCL/DPC++, Kokkos, RAJA, and others.

For the 2021 INCITE proposal submission, Aurora allocations in 2023 will be requested in equivalent “Theta node-hours.” For planning purposes for this cycle, we conservatively expect nearly 1.78 billion Theta-equivalent node-hours to be allocated on Aurora per year and that the average INCITE project will be awarded approximately 40-50 million Theta-equivalent node-hours in 2023. Awarded three-year projects will be required to revisit their Theta-equivalent node-hour request for Aurora, and their computational readiness for Aurora will be re-evaluated in future renewals.

The most recent public information on Aurora can be found at https://aurora.alcf.anl.gov.