Contributed Talk - Splinter Computation

Friday, 25 September 2020, 15:20   (virtual room B)

Performance Portability in Practice: Challenges and Successes with K-Athena (MHD) & Parthenon (AMR)

Forrest Glines, Philipp Grete, Brian O'Shea
Michigan State University

While the fundamental architecture of supercomputers saw little change for a long time, several novel architectures have emerged in recent years on the way towards the exascale era. These include many-core processors such as the Intel Xeon Phi, FPGA accelerator cards, and GPUs for general-purpose computing. Until recently, each new architecture could require a separate, non-trivial rewrite of a simulation code. To circumvent this, a current goal in computational science is the creation of parallel programming paradigms for writing performance portable code: code that runs efficiently on many different supercomputer architectures. We combined Athena++, an existing radiation general relativistic magnetohydrodynamics CPU code, with Kokkos, an on-node parallel programming model, into K-Athena to allow simulations to run efficiently on both CPUs and GPUs using a single codebase. I will introduce performance portability approaches and present profiling and scaling results (up to 24,576 GPUs on the Summit supercomputer) for multiple architectures, including Intel Skylake CPUs, Intel Xeon Phis, and NVIDIA Volta V100 GPUs, with an emphasis on achieving maximum performance on different platforms in practice. In addition, I will introduce Parthenon, a new performance portable adaptive mesh refinement (AMR) framework, and highlight specific challenges and their solutions when executing all AMR-related algorithms on GPUs (i.e., without explicit data transfers between host and device memory).