pytreegrav: A fast Python gravity solver

Gravity is important in a wide variety of science problems. In particular, questions in astrophysics nearly all involve gravity, and can have large (≫ 104) numbers of gravitating masses, such as the stars in a cluster or galaxy, or the discrete fluid elements in a hydrodynamics simulation. Often the gravitational field of such a large number of masses can be too computationally expensive to compute by directly summing the contribution of every single element at every point of interest.


Summary
Gravity is important in a wide variety of science problems. In particular, questions in astrophysics nearly all involve gravity, and can have large (≫ 10 4 ) numbers of gravitating masses, such as the stars in a cluster or galaxy, or the discrete fluid elements in a hydrodynamics simulation. Often the gravitational field of such a large number of masses can be too computationally expensive to compute by directly summing the contribution of every single element at every point of interest.
pytreegrav is a multi-method Python package for computing gravitational fields and potentials. It includes an exact direct-summation ("brute force") solver and a fast, approximate tree-based method that can be orders of magnitude faster than the naïve method. It can compute fields and potentials from arbitrary particle distributions at arbitrary points, with arbitrary softening/smoothing lengths, and is parallelized with OpenMP.

Statement of need
The problem addressed by pytreegrav is the following: given an arbitrary set of "source" masses m i with 3D coordinates x i , and optionally each having a finite spatial extent h i (the softening radius), one would like to compute the gravitational potential Φ and/or the gravitational field g at an arbitrary set of "target" points in space y i . A common application for this is N-body simulations (wherein y i = x i ). It is also often useful for analyzing simulation results after the fact -Φ and g are sometimes not saved in simulation outputs, and even when they are it is often useful to analyze the gravitational interactions between specific subsets of the mass elements in the simulation. Computing g is also important for generating equilibrium initial conditions for N-body simulations (Volker Springel & White, 1999;Yurin & Springel, 2014), and for identifying interesting gravitationally-bound structures such as halos, star clusters, and giant molecular clouds (Behroozi et al., 2013;Grudić et al., 2018;Guszejnov et al., 2020).
Many gravity simulation codes (or multi-physics simulation codes including gravity) have been written that address the problem of gravity computation in a variety of ways for their own internal purposes (Aarseth, 2003;Dehnen & Read, 2011). However, pykdgrav (the precursor of pytreegrav) was the first Python package to offer a generic, modular, triviallyinstallable gravity solver that could be easily integrated into any other Python code, using the fast, approximate tree-based Barnes & Hut (1986) method to be practical for large particle numbers. pykdgrav used a KD-tree implementation accelerated with numba (Lam et al., 2015) to achieve high performance in the potential/field evaluation, however the prerequisite tree-building step had relatively high overhead and a very large memory footprint, because the entire dataset was redundantly stored at every level in the tree hierarchy. This made it difficult to scale to various practical research problems, such as analyzing high-resolution galaxy simulations (Gurvich et al., 2020). pytreegrav is a full refactor of pykdgrav that addresses these shortcomings with a new octree implementation, with drastically reduced tree-build time and memory footprint, and a more efficient non-recursive tree traversal for field summation. This makes it suitable for post-processing datasets from state-of-the-art astrophysics simulations, with upwards of 10 8 particles in the region of interest.

Methods
pytreegrav can compute Φ and g using one of two methods: by "brute force" (explcitly summing the field of every particle, which is exact to machine precision), or using the fast, approximate Barnes & Hut (1986) tree-based method (which is approximate, but much faster for large particle numbers). In N -body problems where the fields at all particle positions must be known, the cost of the brute-force method scales as ∝ N 2 , while the cost of the tree-based method scales less steeply, as ∝ N log N . The brute-force methods are often fastest for small (< 10 3 particle) point sets because they lack the overheads of tree construction and traversal, while the tree-based methods will typically be faster for larger datasets because they reduce the number of floating-point operations required. Both methods are optimized with the numba LLVM JIT compiler (Lam et al., 2015), and the basic Accel and Potential front-end functions will automatically choose the method is likely to be faster, based on this heuristic crossover point of 10 3 particles. Both methods can also optionally be parallelized with OpenMP, via the numba @njit(parallel=True) interface.
The implementation of the tree build and tree-based field summation largely follows that of GADGET-2 (V. Springel, 2005). Starting with an initial cube enclosing all particles, particles are inserted into the tree one at a time. Nodes are divided into 8 subnodes until each subnode contains at most one particle. The indices of the 8 subnodes of each node are stored for an initial recursive traversal of the completed tree, but an optimized tree traversal only needs to know the first subnode (if the node is to be refined) and the index of the next branch of the tree (if the field due to the node is summed directly), so these indices are recorded in the initial recursive tree traversal, and the 8 explicit subnode indices are then deleted, saving memory and removing any empty nodes from consideration. Once these "next branch" and "first subnode" indices are known, the tree field summations can be done in a single while loop with no recursive function calls, which generally improves performance and memory usage.
The field summation itself uses the Barnes & Hut (1986) geometric opening criterion, with improvements suggested by Dubinski (1996): for a node of side length L with centre of mass located at distance r from the target point, its contribution is summed using the monopole approximation (treating the whole node as a point mass) only if r > L/Θ + δ, where Θ = 0.7 by default (giving ∼ 1% RMS error in g), δ is the distance from the node's geometric center to its center of mass. If the conditions for approximation are not satisfied, the node's subnodes are considered in turn, until the field contribution of all mass within the node is summed. pytreegrav supports gravitational softening by assuming the mass distribution of each particle takes the form of a standard M4 cubic spline kernel, which is zero beyond the softening radius h (outside which the field reduces to that of a point mass). Explicit expressions for this form of the softened gravitational potential and field are given in Hopkins (2015). h is allowed to vary from particle to particle, and when summing the field the larger of the source or the target softening is used (symmetrizing the force between overlapping particles). When softenings are nonzero, the largest softening h max of all particles in a node is stored, and a node is always opened in the field summation if r < 0.6L + max (h target , h max ) + δ, where h target is the softening of the target particle where the field is being summed. This ensures that any interactions between physically-overlapping particles are summed directly with the softening kernel.