HylleraasMD: Massively parallel hybrid particle-field molecular dynamics in Python

Molecular dynamics (MD) is a computational methodology in which the dynamical behavior of systems of interacting atoms and molecules is investigated by integrating the corresponding classical equations of motion. The analysis of the molecular trajectories yields an incredibly powerful computational microscope with atomic resolution. While prominent examples of molecular dynamics involving all-atom models exist


Summary
Molecular dynamics (MD) is a computational methodology in which the dynamical behavior of systems of interacting atoms and molecules is investigated by integrating the corresponding classical equations of motion. The analysis of the molecular trajectories yields an incredibly powerful computational microscope with atomic resolution. While prominent examples of molecular dynamics involving all-atom models exist, many systems operate on time-and lengths scales too large, precluding the use of such an approach. The intrinsic complexity of biological soft-matter systems has necessitated the development of coarse-grained (CG) MD models wherein groups of atoms are treated as individual entities. To probe experimentally relevant length-(nm-µm) and time-(ps-ms) scales, further reduction of computational complexity may be warranted through the removal of explicit particle-particle interactions in favor of particle-density field interactions. Such hybrid particle-field (hPF) models recast the interactions between particle pairs into a system of free particles interacting with an external potential dependent on the density, in analogy with self-consistent field theories.
HylleraasMD (named after our affiliate centre, the Hylleraas Centre for Quantum Molecular Sciences) (HyMD) is a Python package capable of highly parallel hPF-MD simulations of a wide range of surfactants and other biological systems in a CG representation. At present, it is the only open-source implementation of the hPF formalism freely available to computational researchers.

Theoretical background
Hybrid particle-field methods are computationally efficient schemes for simulating mesoscale macromolecular assemblies (Milano & Kawakatsu, 2009). Ordinary MD involves, at every integration step, the calculation of computationally expensive double sum over all particle pairs. Despite numerous clever decompositions of the simulation box, which reduces the formal scaling of this (Frenkel & Smit, 2001), it remains the major computational bottleneck. Hybrid particle-field simulations forego this step completely, instead indirectly coupling particles only through an interaction energy functional depending on a slowly varying density field. Exploiting the slow time evolution of the density fields, it is possible to employ multiple time-step algorithms which only seldom impart field impulses on individual particles. Beyond this fundamentally more efficient setup of hPF models, the major advantage over particle-particle approaches is the intrinsically embarrassingly parallel nature of a large portion of the calculations; inter-MPI communication only being necessary whenever the density field is updated. This is traditionally done every tens-several hundreds of MD steps. Accordingly, the hPF methodology has been successfully applied to polymer melts (Wu et al., 2020(Wu et al., , 2021, different phases of lipids and surfactants (Bore et al., 2019Carrer et al., 2020;De Nicola et al., 2015;De Nicola et al., 2021;Ledum et al., 2020), and charged surfactants and polypeptides Kolli et al., 2018;Schäfer et al., 2020).

Statement of need
Elucidating fundamental aspects of the complexity of biological systems often requires atomically resolved mesoscale simulations. One crucial example is the large-scale macromolecular selfassembly of lipids and proteins into eukaryotic cell membranes or intracellular organelle structures. Some such systems are computationally accessible today at the CG-MD level, but this is far from routine and not achievable for the broad scientific community. Hybrid particle-field models allow in principle exploration of such systems at near-atomistic resolution, with good chemical accuracy.
Since the hPF scheme was proposed , two main codes have been used to perform such simulations. HyMD is, to date, the only available open-source hPF simulation software. Furthermore, through a recent reformulation of the hPF formalism , which decouples the computational mesh grid and the length scale of the particle-grid interaction, a new Hamiltonian hPF (HhPF) method has emerged. Currently, HyMD constitutes the only software for performing HhPF simulations, open-source or otherwise. This new scheme has a number of advantages over canonical hPF, such as rigorous energy and momentum conservation, rotationally and translationally invariant forces, and a tunable coarse-graining length scale representing the size extent of particles. Additionally, the new formulation naturally lends itself to calculations in reciprocal space, enabling us to take advantage of highly optimized FFT algorithms.

Features
Apart from a minimal set of high-performance Fortran kernels, the entirety of HyMD is written in Python. This makes extending the software with new functionality easy, enabling fast prototyping of new features. The key components of HyMD include: • Standard hPF interaction functionals, with the option to specify any (local or otherwise) functional, which is automatically handled through symbolic differentiation and numpy vectorization. • Density filtering (with any user-provided filter function), enabling canonical hPF or HhPF simulations with tunable coarse-graining scale which can be changed on-the-fly. • Optional explicit electrostatic interactions through our custom long-range Particle-Mesh Ewald implementation. • All standard intramolecular bonded interactions, including stretching, bending, torsional potentials, and combined bending-torsional potentials describing peptide backbone conformations . • Topological reconstruction of permanent peptide chain backbone dipoles, enabling realistic protein conformational simulations (Alemani et al., 2010;Bore et al., 2018;Cascella et al., 2008).
To probe experimentally relevant structures, parallelization through mpi4py is used. A 2D pencil grid domain decomposition is employed, separating spatial areas of the simulation box across MPI ranks. The highly scalable PFFT (Pippig, 2013) library is used for reciprocal space calculations, as a backend for the PMESH (Feng et al., 2017) package through which we handle the particle-mesh part of the code. A specialized HDF5 file format for MD trajectories (Buyl et al., 2014) is used to enable massively parallel file IO while maintaining an easy structural organization of quantities calculated for storage.