AMReX: a framework for block-structured adaptive mesh refinement

Author(s): Zhang, Weiqun; Almgren, Ann; Beckner, Vince; Bell, John; Blaschke, Johannes; Chan, Cy; Day, Marcus; Friesen, Brian; Gott, Kevin; Graves, Daniel; Katz, Max; Myers, Andrew; Nguyen, Tan; Nonaka, Andrew; Rosso, Michele; Williams, Samuel; Zingale, Michael


Summary
AMReX is a C++ software framework that supports the development of block-structured adaptive mesh refinement (AMR) algorithms for solving systems of partial differential equations (PDEs) with complex boundary conditions on current and emerging architectures.
Block-structured AMR discretization provides the basis for the temporal and spatial strategy for a large number of applications; see, e.g., (A. S. Almgren, Bell, Colella, Howell, & Welcome, 1998; J. Bell, Berger, Saltzman, & Welcome, 1994; M. J. Berger & Colella, 1989; M. J. Berger & Oliger, 1984; Pember et al., 1998) for some of the earliest block-structured AMR work. There are also a number of block-structured and octree AMR software frameworks publicly available; see ("AMR Resources Web page," n.d.) for links to many of them.
AMR reduces the computational cost and memory footprint compared to a uniform mesh while preserving the local descriptions of different physical processes in complex multiphysics algorithms. Current AMReX-based application codes span a number of areas, including atmospheric modeling, astrophysics, combustion, cosmology, fluctuating hydrodynamics, multiphase flows, and particle accelerators. In particular, the AMReX-Astro GitHub repository holds a number of astrophysical modeling tools based on AMReX (Zingale et al., 2018). The origins of AMReX trace back to the BoxLib (W. Zhang et al., 2016) software framework.
AMReX supports a number of different time-stepping strategies and spatial discretizations. Solution strategies supported by AMReX range from level-by-level approaches (with or without subcycling in time) with multilevel synchronization to full-hierarchy approaches, and any combination thereof. User-defined kernels that operate on patches of data can be written in C++ or Fortran. AMReX also provides a Fortran interface that wraps the core C++ data structures and operations, so that an application code based on AMReX can be written entirely in Fortran.
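To make the idea of subcycling in time concrete, the following is a minimal sketch in standard C++. It is not AMReX code: the `Hierarchy` struct and `advance_level` function are hypothetical names introduced only for illustration, and a single refinement ratio between all adjacent levels is assumed. Each level advances by its own time step, and the next finer level then takes `ref_ratio` smaller steps to catch up.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bookkeeping for a level-by-level advance with subcycling.
// None of these names are AMReX API.
struct Hierarchy {
    int ref_ratio;              // refinement ratio between adjacent levels
    std::vector<int> nsteps;    // steps taken so far on each level
    std::vector<double> time;   // simulation time reached on each level
};

// Advance level `lev` by `dt`, then recursively advance the next finer
// level by `ref_ratio` substeps of size dt / ref_ratio so it catches up.
void advance_level(Hierarchy& h, std::size_t lev, double dt)
{
    assert(lev < h.nsteps.size());
    assert(h.ref_ratio > 0);

    h.nsteps[lev] += 1;         // stand-in for the single-level update
    h.time[lev] += dt;

    if (lev + 1 < h.nsteps.size()) {
        for (int i = 0; i < h.ref_ratio; ++i) {
            advance_level(h, lev + 1, dt / h.ref_ratio);
        }
        // Levels lev and lev + 1 have now reached the same time. An
        // application would perform its multilevel synchronization here.
    }
}
```

With three levels and a refinement ratio of 2, one coarse step of size `dt` results in 1, 2, and 4 steps on levels 0, 1, and 2. Without subcycling, every level would instead take the same (finest) time step.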
AMReX developers believe that interoperability is an important feature of sustainable software. AMReX has examples of interfaces to other popular software packages such as SUNDIALS (Hindmarsh et al., 2005), PETSc (Balay et al., 2019) and hypre (Balay et al., 2019). It is also part of the 2018 xSDK ("xSDK Version 0.4.0 Web page," n.d.) software release and is therefore installable with Spack.

Mesh and Particle Data
AMReX supplies data containers and iterators for mesh-based fields and particle data. The mesh-based data can be defined on cell centers, cell faces, or cell corners (nodes). Coordinate systems include 1D Cartesian or spherical; 2D Cartesian or cylindrical (r-z); and 3D Cartesian.
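The difference between cell-centered, face-centered, and nodal data can be illustrated by counting the points each centering places on the same box of cells. The sketch below uses only standard C++. The `Centering` alias and `num_points` helper are hypothetical and are not the AMReX data containers; the per-direction nodal flag is an assumption made for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper, not AMReX API: a per-direction flag saying whether
// data live on nodes (true) or on cell centers (false) in that direction.
using Centering = std::array<bool, 3>;

constexpr Centering cell_centered{false, false, false};
constexpr Centering nodal{true, true, true};
constexpr Centering x_face{true, false, false};  // faces normal to x

// Number of data points on a box with ncells[d] cells in direction d.
// A nodal direction has one more point than it has cells.
long num_points(const std::array<int, 3>& ncells, const Centering& c)
{
    long n = 1;
    for (std::size_t d = 0; d < 3; ++d) {
        assert(ncells[d] > 0);
        n *= ncells[d] + (c[d] ? 1 : 0);
    }
    return n;
}
```

For a box of 4 x 4 x 4 cells this gives 64 cell-centered points, 125 nodal points, and 80 points on the x-faces.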
AMReX provides data structures and iterators for performing data-parallel particle simulations. The approach is particularly suited to particles that interact with data defined on a (possibly adaptive) block-structured hierarchy of meshes. Example applications include those that use Particle-in-Cell (PIC) methods, Lagrangian tracers, or solid particles that exchange momentum with the surrounding fluid through drag forces. AMReX's particle implementation allows users flexibility in specifying how the particle data is laid out in memory and in choosing how to optimize parallel communication of particle data.
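As an illustration of what a choice of particle memory layout can mean, the sketch below contrasts an array-of-structs layout with a struct-of-arrays layout for a simple position update. It is plain standard C++ with hypothetical type names and is not the AMReX particle container; it only shows that the same operation can be written against either layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array-of-structs: all fields of one particle are contiguous in memory.
struct Particle {
    double x;  // position
    double v;  // velocity
};
using ParticlesAoS = std::vector<Particle>;

// Struct-of-arrays: each field is contiguous across all particles.
struct ParticlesSoA {
    std::vector<double> x;
    std::vector<double> v;
};

// Move every particle by dt * v (array-of-structs version).
void push(ParticlesAoS& p, double dt)
{
    for (Particle& q : p) {
        q.x += dt * q.v;
    }
}

// Move every particle by dt * v (struct-of-arrays version).
void push(ParticlesSoA& p, double dt)
{
    assert(p.x.size() == p.v.size());
    for (std::size_t i = 0; i < p.x.size(); ++i) {
        p.x[i] += dt * p.v[i];
    }
}
```

Both versions compute the same result. Which layout performs better generally depends on the access pattern: a kernel that touches only one field of many particles tends to favor the struct-of-arrays form, while a kernel that touches all fields of one particle at a time may favor the array-of-structs form.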

Complex Geometries
AMReX provides support for discretizing complex geometries using the cut cell / embedded boundary approach. This requires additional data structures for holding face apertures and normals as well as volume fractions. Support for operations on the mesh hierarchy including cut cells is enabled through the use of specialized discretizations at and near cut cells, and masks to ensure that only values in the valid domain are computed. Examples are provided in the tutorials.
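The role of volume fractions and masks can be sketched in one dimension, where a "wall" at position `x_wall` cuts a single cell and covers everything to its left. This is a deliberately simplified illustration in standard C++ with hypothetical names. It does not reflect the AMReX embedded boundary data structures, which also hold face apertures and normals in multiple dimensions.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

enum class CellType { Covered, Cut, Regular };

struct CellGeom {
    double vfrac;   // fraction of the cell volume occupied by fluid
    CellType type;  // mask used to skip covered cells
};

// 1D domain [0, n*dx) with a wall at x_wall; the fluid occupies x > x_wall.
std::vector<CellGeom> classify_cells(int n, double dx, double x_wall)
{
    assert(n > 0 && dx > 0.0);
    std::vector<CellGeom> cells;
    cells.reserve(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) {
        const double x_hi = (i + 1) * dx;
        // Fluid length inside the cell, clamped to [0, dx], over dx.
        const double vfrac =
            std::min(1.0, std::max(0.0, (x_hi - x_wall) / dx));
        CellType type = CellType::Cut;
        if (vfrac == 0.0) { type = CellType::Covered; }
        if (vfrac == 1.0) { type = CellType::Regular; }
        cells.push_back({vfrac, type});
    }
    return cells;
}

// Total fluid volume: covered cells are masked out, cut cells contribute
// only their fluid fraction.
double fluid_volume(const std::vector<CellGeom>& cells, double dx)
{
    double vol = 0.0;
    for (const CellGeom& c : cells) {
        if (c.type != CellType::Covered) {
            vol += c.vfrac * dx;
        }
    }
    return vol;
}
```

For five unit cells and a wall at 2.25, cells 0 and 1 are covered, cell 2 is a cut cell with volume fraction 0.75, and cells 3 and 4 are regular.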

Parallelism
AMReX's GPU strategy focuses on providing performant GPU support with minimal changes to AMReX-based application codes and maximum flexibility. This allows application teams to get running on GPUs quickly while allowing long-term performance tuning and programming model selection. AMReX currently uses CUDA for GPUs, but application teams can use CUDA, CUDA Fortran, OpenACC, or OpenMP in their individual codes. AMReX will support non-CUDA strategies as appropriate.
When running on CPUs, AMReX uses an MPI+X strategy where the X threads are used to perform parallelization techniques like tiling. The most common X as of this writing is OpenMP, but AMReX is rapidly evolving to work effectively on GPUs. On GPUs, AMReX requires CUDA and can be further combined with other parallel GPU languages, including OpenACC and OpenMP, to control the offloading of subroutines to the GPU. This MPI+CUDA+X GPU strategy has been developed to give users the maximum flexibility to find the best combination of portability, readability and performance for their applications.
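The tiling technique mentioned above can be sketched as follows: a 2D box is divided into smaller tiles, and the loop over tiles is shared among threads. This is a minimal illustration in standard C++ with an optional OpenMP pragma. The `Tile` struct and both functions are hypothetical names and are not the AMReX iterators; the row-major storage is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open index ranges [ilo, ihi) x [jlo, jhi) of one tile.
struct Tile { int ilo, ihi, jlo, jhi; };

// Cover an nx-by-ny box with tiles of at most tx-by-ty cells.
std::vector<Tile> make_tiles(int nx, int ny, int tx, int ty)
{
    assert(nx > 0 && ny > 0 && tx > 0 && ty > 0);
    std::vector<Tile> tiles;
    for (int j = 0; j < ny; j += ty) {
        for (int i = 0; i < nx; i += tx) {
            tiles.push_back({i, std::min(i + tx, nx),
                             j, std::min(j + ty, ny)});
        }
    }
    return tiles;
}

// Sum a field stored in row-major order, one tile at a time. Tiles do not
// overlap, so different threads can work on different tiles.
double tiled_sum(const std::vector<double>& f, int nx, int ny, int tx, int ty)
{
    assert(f.size() == static_cast<std::size_t>(nx) * static_cast<std::size_t>(ny));
    const std::vector<Tile> tiles = make_tiles(nx, ny, tx, ty);
    const int ntiles = static_cast<int>(tiles.size());
    double total = 0.0;
#ifdef _OPENMP
#pragma omp parallel for reduction(+:total)
#endif
    for (int t = 0; t < ntiles; ++t) {
        const Tile& tl = tiles[static_cast<std::size_t>(t)];
        for (int j = tl.jlo; j < tl.jhi; ++j) {
            for (int i = tl.ilo; i < tl.ihi; ++i) {
                total += f[static_cast<std::size_t>(j) * nx + i];
            }
        }
    }
    return total;
}
```

A 10 x 7 box with 4 x 4 tiles is covered by six tiles, some of them truncated at the box boundary. Compiled without OpenMP, the loop simply runs serially over the tiles.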

Asynchronous Iterators and Fork-Join Support
AMReX includes a runtime system that can execute asynchronous AMReX-based applications efficiently on large-scale systems. The runtime system constructs a task dependency graph for the whole coarse time step and executes it asynchronously to the completion of the step. There is also support for more user-specific algorithms such as asynchronous filling of ghost cells across multiple ranks, including interpolation of data in space and time.
In addition, AMReX has support for fork-join functionality. During a run of an AMReX-based application, the user can divide the MPI ranks into subgroups (i.e., fork) and assign each subgroup an independent task to compute in parallel with each other. After all of the forked child tasks complete, they synchronize (i.e., join), and the parent task continues execution as before. The fork-join operation can also be invoked in a nested fashion, creating a hierarchy of fork-join operations, where each fork further subdivides the ranks of a task into child tasks. This approach enables heterogeneous computation and reduces the strong scaling penalty for operations with less inherent parallelism or with large communication overheads.
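The "fork" step amounts to deciding which ranks belong to which task. The sketch below shows one possible way to do this in standard C++: contiguous blocks of ranks are assigned to tasks in proportion to user-supplied weights. The function `split_ranks` and its weighting scheme are assumptions made for illustration and do not describe how AMReX itself divides ranks. In an MPI program, the resulting task index could serve as the `color` argument of `MPI_Comm_split`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not AMReX API: assign each of `nranks` ranks to one
// of weights.size() tasks. Tasks receive contiguous blocks of ranks, sized
// roughly in proportion to their weights, with at least one rank per task.
std::vector<int> split_ranks(int nranks, const std::vector<double>& weights)
{
    const int ntasks = static_cast<int>(weights.size());
    assert(ntasks > 0 && nranks >= ntasks);

    double total = 0.0;
    for (double w : weights) {
        assert(w > 0.0);
        total += w;
    }

    std::vector<int> task_of_rank(static_cast<std::size_t>(nranks), 0);
    double cumulative = 0.0;
    int begin = 0;
    for (int t = 0; t < ntasks; ++t) {
        cumulative += weights[static_cast<std::size_t>(t)];
        int end = (t == ntasks - 1)
            ? nranks
            : static_cast<int>(std::lround(cumulative / total * nranks));
        end = std::max(end, begin + 1);                 // at least one rank
        end = std::min(end, nranks - (ntasks - 1 - t)); // leave one per later task
        for (int r = begin; r < end; ++r) {
            task_of_rank[static_cast<std::size_t>(r)] = t;
        }
        begin = end;
    }
    return task_of_rank;
}
```

With 8 ranks and weights 3 and 1, ranks 0 to 5 are assigned to task 0 and ranks 6 and 7 to task 1. A nested fork would apply the same function again to the ranks within one task.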

Linear Solvers
AMReX includes native linear solvers for parabolic and elliptic equations. Solution procedures include geometric multigrid (Briggs, Henson, & McCormick, 2000) and BiCGStab iterative solvers; interfaces to external hypre and PETSc solvers are also provided. The linear solvers operate on regular mesh data as well as data with cut cells.
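For readers unfamiliar with BiCGStab, the following is a minimal, unpreconditioned textbook version applied to a matrix-free 1D Laplacian with homogeneous Dirichlet boundaries. It is written in standard C++ and is not the AMReX solver: the function names, the choice of operator, and the zero initial guess are assumptions of this sketch, and no safeguards against breakdown are included.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Matrix-free operator: (A x)_i = 2 x_i - x_{i-1} - x_{i+1}, with zero
// values assumed outside the domain.
Vec apply_laplacian(const Vec& x)
{
    const std::size_t n = x.size();
    Vec y(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double left  = (i > 0)     ? x[i - 1] : 0.0;
        const double right = (i + 1 < n) ? x[i + 1] : 0.0;
        y[i] = 2.0 * x[i] - left - right;
    }
    return y;
}

double dot(const Vec& a, const Vec& b)
{
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) { s += a[i] * b[i]; }
    return s;
}

// Solve A x = b starting from x = 0. Returns the final residual 2-norm.
double bicgstab(const Vec& b, Vec& x, double tol, int max_iter)
{
    const std::size_t n = b.size();
    x.assign(n, 0.0);
    Vec r = b;            // residual b - A x for x = 0
    const Vec rhat = r;   // fixed shadow residual
    Vec p(n, 0.0), v(n, 0.0), s(n, 0.0);
    double rho = 1.0, alpha = 1.0, omega = 1.0;
    double rnorm = std::sqrt(dot(r, r));

    for (int it = 0; it < max_iter && rnorm > tol; ++it) {
        const double rho_new = dot(rhat, r);
        const double beta = (rho_new / rho) * (alpha / omega);
        for (std::size_t i = 0; i < n; ++i) {
            p[i] = r[i] + beta * (p[i] - omega * v[i]);
        }
        v = apply_laplacian(p);
        alpha = rho_new / dot(rhat, v);
        for (std::size_t i = 0; i < n; ++i) { s[i] = r[i] - alpha * v[i]; }

        const double snorm = std::sqrt(dot(s, s));
        if (snorm <= tol) {   // converged after the first half of the step
            for (std::size_t i = 0; i < n; ++i) { x[i] += alpha * p[i]; }
            rnorm = snorm;
            break;
        }

        const Vec t = apply_laplacian(s);
        omega = dot(t, s) / dot(t, t);
        for (std::size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i] + omega * s[i];
            r[i] = s[i] - omega * t[i];
        }
        rho = rho_new;
        rnorm = std::sqrt(dot(r, r));
    }
    return rnorm;
}
```

A simple check is to build the right-hand side from a known solution, for example `b = apply_laplacian(Vec(16, 1.0))`, and confirm that `bicgstab` recovers a vector of ones to within the requested tolerance.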

I/O and Post-processing
AMReX has native I/O for checkpointing and for reading and writing plotfiles for post-processing analysis or visualization. AMReX also supplies interfaces to HDF5. The AMReX plotfile format is supported by VisIt (Childs et al., 2012), ParaView (Ahrens, Geveci, & Law, 2005), and yt (Turk et al., 2011). AMReX also has linkages to external routines through both Conduit ("Conduit Web page," n.d.) and SENSEI ("SENSEI Web page," n.d.).

Documentation, Tutorials and Profiling Tools
Extensive documentation of core AMReX functionality is available online, and many of the application codes based on AMReX are publicly available as well. Smaller examples of using AMReX for building application codes are provided in the AMReX Tutorials section. Examples include a Particle-in-Cell (PIC) code, a compressible Navier-Stokes solver in complex geometry, advection-diffusion solvers, support for spectral deferred corrections time-stepping, and much more.
AMReX-based application codes can be instrumented using AMReX-specific performance profiling tools that take into account the hierarchical nature of the mesh in most AMReX-based applications. These codes can be instrumented for varying levels of profiling detail.