swyft: Truncated Marginal Neural Ratio Estimation in Python



Summary
Parametric stochastic numerical simulators are ubiquitous in science. They model observed phenomena by mapping a parametric representation of simulation conditions to a hypothetical observation, effectively sampling from a probability distribution over observational data known as the likelihood. Simulators are advantageous because they easily encode relevant scientific knowledge. Simulation-based inference (SBI) is a machine learning technique which applies a simulator, a fitted statistical surrogate model, and a set of prior beliefs to estimate a probabilistic description of the parameters which plausibly generated some observational data. This description of the parameters is known as the posterior, and it is the end-product of Bayesian inference.
Our package swyft implements a specific, simulation-efficient SBI method called Truncated Marginal Neural Ratio Estimation (TMNRE); it estimates the likelihood-to-evidence ratio to approximate the posterior, as in Hermans et al. (2020). swyft (Miller et al., 2020) provides a collection of tools to simulate and store data, locally or in a distributed computing setting, and to perform (marginalized) simulation-based Bayesian inference. It produces ready-to-publish plots that demonstrate the calibration of the posterior estimate along with the posterior itself.

Statement of Need
Estimating the posterior can be prohibitively expensive for complex data and slow simulators. Part of the reason is the sequential nature of likelihood-based Markov chain Monte-Carlo (Hastings, 1970; Metropolis et al., 1953). In contrast, SBI parallelizes simulation in most circumstances, thereby reducing the practical waiting time for results. In pursuit of further simulation efficiency, Miller et al. (2021) argue that fitting the joint posterior for all parameters is unnecessary when a marginal estimate of the posterior will suffice. Some SBI methods are amortized, whereby the statistical model is fit to estimate posteriors for all possible observations simultaneously. While amortization enables necessary posterior calibration checks, like expected coverage probability (Hermans et al., 2021; Miller et al., 2021), it is more simulation-efficient to fit the model only on the region of parameter space that could plausibly have generated the observation.
swyft satisfies necessary requirements, like estimating the marginal posteriors of interest and enabling posterior calibration checks, while taking a lean approach that avoids all unnecessary simulation. In this pursuit, swyft truncates the prior to regions relevant for given observational data and reuses compatible existing simulations. swyft automates irksome matters like distributed computing and data storage with dask (Dask Development Team, 2016) and zarr (Miles et al., 2021), respectively. swyft is designed to:
1. Estimate arbitrary marginal posteriors, i.e., the posterior over parameters of interest, marginalizing over nuisance parameters.
2. Perform targeted inference by truncating the prior distribution with an indicator function estimated in a sequence of inferences.
3. Estimate the expected coverage probability of fully amortized SBI posteriors and locally amortized posteriors that are limited to truncated regions.
4. Seamlessly reuse simulations from previous analyses by drawing already-simulated data first via a flexible storage solution.
5. Integrate advanced distribution and storage tools to simplify the application of complex simulators.
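The truncation in point 2 can be illustrated with a small sketch (plain numpy for illustration only; swyft's actual data structure and API differ): given log-ratio estimates on prior samples, keep the rectangular region where the estimated ratio exceeds a small fraction ε of its maximum, and turn that region into an indicator function over the prior.

```python
import numpy as np

def truncation_indicator(theta, log_ratio, epsilon=1e-5):
    """Bound the region where the estimated ratio exceeds
    epsilon * max ratio, and return it as an indicator function.
    theta: (n, d) prior samples; log_ratio: (n,) log-ratio estimates."""
    keep = log_ratio > np.max(log_ratio) + np.log(epsilon)
    lo = theta[keep].min(axis=0)
    hi = theta[keep].max(axis=0)

    def indicator(t):
        t = np.atleast_2d(t)
        return np.all((t >= lo) & (t <= hi), axis=1)

    return lo, hi, indicator

# Toy example: a ratio estimate sharply peaked around theta = 0.3.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 1.0, size=(10_000, 1))
log_r = -0.5 * ((theta[:, 0] - 0.3) / 0.02) ** 2
lo, hi, ind = truncation_indicator(theta, log_r)
```

Subsequent simulations are then drawn only from the region where the indicator is one, which is what makes a follow-up inference round cheaper.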
Although there is a rich ecosystem of SBI implementations, TMNRE did not fit naturally into an existing framework since it requires parallel estimation of marginal posteriors and a truncated prior. swyft overcomes these challenges by training the ratio estimators in parallel along an additional dimension of a PyTorch tensor and by implementing a custom truncated-prior data structure. swyft aims to meet the ever-increasing demand for efficient and testable Bayesian inference in fields like physics, cosmology, and astronomy by implementing TMNRE together with practical distributed computing and storage tools.
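The extra-dimension trick can be sketched in plain PyTorch (an illustration of the idea, not swyft's internal code): each marginal gets its own small head, stacked along a leading dimension of the weight tensor so that all heads train in a single forward and backward pass.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n_marginals, batch = 3, 8
x_dim, theta_dim = 5, 2  # data features; parameters per marginal

# One linear "head" per marginal, stored along an extra leading
# dimension so all heads are evaluated in one forward pass.
W = nn.Parameter(torch.randn(n_marginals, x_dim + theta_dim, 1) * 0.1)

x = torch.randn(batch, x_dim)                       # simulated data
theta = torch.randn(batch, n_marginals, theta_dim)  # params per marginal

# Broadcast x across the marginal dimension, then contract the
# feature axis against each head's weights with einsum.
inp = torch.cat([x.unsqueeze(1).expand(-1, n_marginals, -1), theta], dim=-1)
logits = torch.einsum('bmf,mfo->bmo', inp, W).squeeze(-1)  # (batch, n_marginals)

# Joint-vs-marginal labels; one binary classification loss per head,
# averaged so every marginal trains simultaneously.
labels = torch.randint(0, 2, (batch, n_marginals)).float()
loss = F.binary_cross_entropy_with_logits(logits, labels)
loss.backward()
```

A single optimizer step then updates every marginal's head at once, which is why estimating a full corner plot costs little more than estimating one marginal.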

Existing research with swyft
The software package has enabled inference on dark matter substructure in strongly lensed galaxies (Coogan et al., 2020), has been used to estimate cosmological parameters from cosmic microwave background simulation data, and was cited in a white paper laying out a vision for astroparticle physics research during the next decade (Batista et al., 2021). Ongoing work with swyft aims to reduce the response time to gravitational wave triggers from LIGO-Virgo by estimating the marginal posterior with unprecedented speed. There is an existing proof-of-concept by Delaunoy et al. (2020), although the swyft software package was not applied. Generally, speeding up gravitational wave inference using simulation-based inference is an active area of research (Chua & Vallisneri, 2020; Dax et al., 2021; Gabbard et al., 2022). In another work-in-progress, swyft helps to characterize the magnetohydrodynamics of binary neutron star mergers using multi-messenger gravitational and electrodynamic data, where marginalization would be impossible with likelihood-based methods.

Related theoretical work
There is a long tradition of likelihood-free inference, also known as Approximate Bayesian Computation (ABC), going back to as early as the 1980s (Beaumont et al., 2009; Diggle & Gratton, 1984; Rubin, 1984; Tavaré et al., 1997; Toni et al., 2009). Traditional techniques use Monte-Carlo rejection sampling and are summarized by Sisson et al. (2018) and Karabatsos & Leisen (2018). We trace the development of classifiers for the estimation of likelihood ratios through a few key references. Cranmer et al. (2015) estimated the ratio between the likelihood of a freely varying parameter and a fixed reference value for frequentist inference. Pham et al. (2014) estimated the ratio between likelihoods for Markov chain Monte-Carlo sampling. Thomas et al. (2016) and Gutmann et al. (2018) introduced the framework which allows for likelihood-to-evidence ratio estimation. Like swyft, Blum & François (2010) proposed to truncate the prior for sampling, but did so within an ABC scheme.
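For contrast with the neural methods discussed next, classical ABC rejection sampling fits in a few lines. In this sketch the Gaussian-mean simulator, the mean summary statistic, and the tolerance of 0.1 are all illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulator(theta, n=20):
    """Toy simulator: n Gaussian draws with unknown mean theta."""
    return rng.normal(theta, 1.0, size=n)

x_obs = simulator(0.5)           # "observed" data, true mean 0.5
summary = lambda x: x.mean()     # summary statistic

# ABC rejection: keep prior draws whose simulated summary lands
# within a tolerance of the observed summary.
posterior = []
for _ in range(20_000):
    theta = rng.uniform(-3.0, 3.0)  # draw from a uniform prior
    if abs(summary(simulator(theta)) - summary(x_obs)) < 0.1:
        posterior.append(theta)

posterior = np.array(posterior)
```

The accepted draws approximate samples from the posterior, but the acceptance rate collapses as the data dimension grows or the tolerance shrinks, which motivates the classifier-based ratio estimators above.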
Modern SBI is a quickly evolving field with several techniques under development (Cranmer et al., 2020). Neural network-based methods are categorized according to the term they approximate in Bayes' formula. swyft is a method which approximates the likelihood-to-evidence ratio p(x|θ)/p(x), where θ are the parameters and x is the observational data. Works by Hermans et al. (2020) and Durkan et al. (2020), among others, are closely related to swyft as they also approximate the likelihood-to-evidence ratio. Like swyft, one closely related method estimates marginal posteriors, but unlike swyft it attempts to amortize over all possible marginals with a single neural network. Other methods estimate the posterior directly (Greenberg et al., 2019; Lueckmann et al., 2017; Papamakarios & Murray, 2016) or the likelihood itself (Lueckmann et al., 2019; Papamakarios et al., 2019).
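The role of this ratio can be verified on a tractable Gaussian toy model in which the likelihood, evidence, and posterior are all known in closed form, so the identity p(θ|x) = r(x, θ) p(θ) can be checked numerically (a standalone sketch, independent of swyft):

```python
import numpy as np

def gauss(z, mu, var):
    """Density of a normal distribution N(mu, var) evaluated at z."""
    return np.exp(-0.5 * (z - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

# Model: theta ~ N(0, 1) and x | theta ~ N(theta, 1).
# Then the evidence is x ~ N(0, 2) and the exact posterior
# is theta | x ~ N(x / 2, 1 / 2).
x, theta = 1.2, 0.4
likelihood = gauss(x, theta, 1.0)     # p(x | theta)
evidence = gauss(x, 0.0, 2.0)         # p(x)
prior = gauss(theta, 0.0, 1.0)        # p(theta)
posterior = gauss(theta, x / 2, 0.5)  # p(theta | x)

ratio = likelihood / evidence         # r(x, theta) = p(x|theta) / p(x)
```

Multiplying the learned ratio by the prior therefore recovers the posterior pointwise, which is exactly how a trained ratio estimator is used at inference time.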

Related software
swyft is unique because it implements TMNRE and a method for simulation reuse. It also offers sophisticated distributed simulation and storage tools coupled directly to the software. We briefly discuss the alternatives in the thriving ecosystem of SBI software packages. sbi (Tejero-Cantero et al., 2020) features a selection of modern neural SBI algorithms. It is accompanied by a benchmark, sbibm (Lueckmann et al., 2021), which tests those methods against a set of tractable toy problems. pydelfi (Alsing, 2019) estimates the likelihood of a learned summary statistic (Alsing et al., 2018); swyft users should pay special attention to this repository since it can also project out nuisance parameters. carl (Louppe et al., 2016) uses a classifier to estimate the likelihood ratio as Cranmer et al. (2015) did, and hypothesis (Hermans, 2019) includes several toy simulators.

Description of software
swyft implements Marginal Neural Ratio Estimation (MNRE), a method which trains an amortized likelihood-to-evidence ratio estimator for any marginal posterior of interest. swyft makes it easy to estimate a set of marginals in parallel, e.g., for a corner plot. To use swyft, the operator must provide a quantification of prior beliefs, a Python-callable or bash-scriptable simulator, and an observation of interest.
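In code, those three ingredients amount to something like the following (a generic illustration of the inputs, not swyft's actual interface; the real class and function names are documented in the package):

```python
import numpy as np

rng = np.random.default_rng(2)

# 1. A quantification of prior beliefs: here, a uniform prior over [-1, 1].
def sample_prior(n):
    return rng.uniform(-1.0, 1.0, size=(n, 1))

# 2. A Python-callable simulator mapping parameters to synthetic data.
def simulator(theta):
    return theta + 0.1 * rng.normal(size=theta.shape)

# 3. An observation of interest.
x_obs = np.array([[0.25]])

# Training data for a ratio estimator: (theta, x) pairs from the joint.
thetas = sample_prior(1000)
xs = simulator(thetas)
```

Everything else, such as training, truncation, and plotting, operates on these three objects.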
Performing TMNRE with swyft, by restricting simulation to a truncated prior region, is simple and demonstrated in the documentation. Constructing these truncated regions can be done manually or based on a previous inference. Routines are provided for all necessary plots and for calculating the expected coverage probability of a given likelihood-to-evidence ratio estimator. This calculation is essential as a sanity check to determine whether the approximate posterior is calibrated.
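The idea behind the expected coverage check can be sketched without any swyft-specific machinery: simulate (θ*, x) pairs, build a credible interval from the posterior for each x, and count how often θ* falls inside; a calibrated posterior reproduces the nominal level. A minimal sketch on a Gaussian toy model where the exact posterior is known:

```python
import numpy as np

rng = np.random.default_rng(3)

def empirical_coverage(alpha, n_trials=5000, n_samples=500):
    """Fraction of trials where the true theta lands inside the
    (1 - alpha) central credible interval built from posterior samples."""
    hits = 0
    for _ in range(n_trials):
        theta_true = rng.normal()       # theta* ~ prior N(0, 1)
        x = rng.normal(theta_true)      # x | theta* ~ N(theta*, 1)
        # Exact posterior for this model: theta | x ~ N(x / 2, 1 / 2).
        samples = rng.normal(x / 2, np.sqrt(0.5), size=n_samples)
        lo, hi = np.quantile(samples, [alpha / 2, 1 - alpha / 2])
        hits += lo <= theta_true <= hi
    return hits / n_trials

cov = empirical_coverage(alpha=0.32)  # nominal coverage 0.68
```

With an approximate posterior, the same count diagnoses overconfidence (empirical coverage below nominal) or conservativeness (above nominal).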
The machine learning aspects of swyft are implemented in PyTorch (Paszke et al., 2019), while the truncated prior is implemented with numpy (Harris et al., 2020). Storing previously simulated data for reuse in later analyses is accomplished with zarr (Miles et al., 2021), and parallelization of simulation is achieved with dask (Dask Development Team, 2016). swyft has other important dependencies, namely scipy, seaborn (Waskom, 2021), matplotlib (Hunter, 2007), pandas (McKinney, 2010; Reback et al., 2021), and jupyter (Kluyver et al., 2016).

Acknowledgements
This work was supported by the Netherlands eScience Center and SURF under grant number ETEC.2019.018, and was carried out on the Dutch national e-infrastructure with the support of SURF Cooperative. We also thank SURF for providing computational resources via the EINF-1194 grant.