starry_process: Interpretable Gaussian processes for stellar light curves

In this note we present the starry_process code, which implements an interpretable Gaussian process (GP) for modeling variability in stellar light curves. As dark starspots rotate in and out of view, the total flux received from a distant star will change over time. Unresolved flux time series therefore encode information about the spatial structure of features on the stellar surface. The starry_process software package allows one to easily model the flux variability due to starspots, whether one is interested in understanding the properties of these spots or marginalizing over the stellar variability when it is treated as a nuisance signal. The main difference between the GP implemented here and typical GPs used to model stellar variability is the explicit dependence of our GP on physical properties of the star, such as its period, inclination, and limb darkening coefficients, and on properties of the spots, such as their radius and latitude distributions. This code is the Python implementation of the interpretable GP algorithm developed in Luger, Foreman-Mackey, and Hedges (2021).


Summary
The starry_process code implements an interpretable Gaussian process (GP) for modeling variability in stellar light curves. As dark starspots rotate in and out of view, the total flux received from a distant star will change over time. Unresolved flux time series therefore encode information about the spatial structure of features on the stellar surface. The starry_process software package allows one to easily model the flux variability due to starspots, whether one is interested in understanding the properties of these spots or marginalizing over the stellar variability when it is treated as a nuisance signal. The main difference between the GP implemented here and typical GPs used to model stellar variability is the explicit dependence of our GP on physical properties of the star, such as its period, inclination, and limb darkening coefficients, and on properties of the spots, such as their radius and latitude distributions (see Figure 1). This code is the Python implementation of the interpretable GP algorithm developed in Luger, Foreman-Mackey, & Hedges (2021).

Statement of need
Mapping the surfaces of stars using time series measurements is a fundamental problem in modern time-domain stellar astrophysics. This inverse problem is ill-posed and computationally intractable, but in the associated AAS Journals publication submitted in parallel to this paper , we derive an interpretable effective Gaussian Process (GP) model for this problem that enables robust probabilistic characterization of stellar surfaces using photometric time series observations. Our model builds on previous work by Perger et al. (2020) on semi-interpretable Gaussian processes for stellar timeseries data and by Morris (2020) on approximate inference for large ensembles of stellar light curves. Implementation of our model requires the efficient evaluation of a set of special functions and recursion relations that are not readily available in existing probabilistic programming frameworks. The starry_process package provides the necessary elements to perform this analysis with existing and forthcoming astronomical datasets.

Implementation
We implement our interpretable GP in the user-friendly Python package starry_process, which can be installed via pip or from source on GitHub. The code is thoroughly unit-tested and well documented, with examples on how to use the GP in custom inference problems. As discussed in the associated AAS Journals publication , users can choose, among other options, whether or not to marginalize over the stellar inclination and whether or not to model a normalized process. Users can also choose the spherical harmonic degree of the expansion, although it is recommended to use l max = 15 (see below). Users may compute the mean vector and covariance matrix in either the spherical harmonic basis or the flux basis, or they may sample from it or use it to compute marginal likelihoods. Arbitrary order limb darkening is implemented following Agol et al. (2020).
The code was designed to maximize the speed and numerical stability of the computation. Although the computation of the GP covariance involves many layers of nested sums over spherical harmonic coefficients, these may be expressed as high-dimensional tensor products, which can be evaluated efficiently on modern hardware. Many of the expressions can also be either pre-computed or computed recursively. To maximize the speed of the algorithm, the code is implemented in hybrid C++/Python using the just-in-time compilation capability of the Theano package (Theano Development Team, 2016). Since all equations derived here have closed form expressions, these can be autodifferentiated in a straightforward and numerically stable manner, enabling the computation of backpropagated gradients within Theano. As such, starry_process is designed to work out-of-the box with Theano-based inference tools such as PyMC3 for NUTS/HMC or ADVI sampling (Salvatier et al., 2016). Figure 2 shows the computational scaling of the Python implementation of the algorithm for the case where we condition the GP on a specific value of the inclination (blue) and the case where we marginalize over inclination (orange). Both curves show the time in seconds to compute the likelihood (averaged over many trials to obtain a robust estimate) as a function of the number of points K in a single light curve. For K 100, the computation time is constant at 10 − 30 ms for both algorithms. This is the approximate time (on a typical modern laptop) taken to compute the GP covariance matrix given a set of hyperparameters θ θ θ • . For larger values of K, the cost approaches a scaling of K 2.6 , which is dominated by the factorization of the covariance matrix and the solve operation to compute the likelihood. The likelihood marginalized over inclination is only slightly slower to compute, thanks to the tricks discussed in Luger, Foreman-Mackey, & Hedges (2021).  Many modern GP packages (e.g., Ambikasaran et al., 2015;Foreman-Mackey et al., 2017) have significantly better asymptotic scalings, but these are usually due to specific structure imposed on the kernel functions, such as the assumption of stationarity. Our kernel structure is determined by the physics (or perhaps more accurately, the geometry) of stellar surfaces, and its nonstationarity is a consequence of the normalization step in relative photometry (Luger, Foreman-Mackey, Hedges, & Hogg, 2021). Moreover, and unlike the typical kernels used for GP regression, our kernel is a nontrivial function of the hyperparameters θ θ θ • , so its computation is necessarily more expensive. Nevertheless, the fact that our GP may be used for likelihood evaluation in a small fraction of a second for typical datasets (K ∼ 1, 000) makes it extremely useful for inference.
Our algorithm is also numerically stable over nearly all of the prior volume up to l max = 15. Figure 3 shows the log of the condition number of the covariance matrix in the spherical harmonic basis as a function of the spherical harmonic degree of the expansion for 100 draws from a uniform prior over the domain of the hyperparameters. The condition number is nearly constant up to l max = 15 in almost all cases; above this value, the algorithm suddenly becomes unstable and the covariance is ill-conditioned. The instability occurs within the computation of the latitude and longitude moment integrals and is likely due to the large number of operations involving linear combinations of hypergeometric and gamma functions. While it may be possible to achieve stability at higher values of l max via careful reparametrization of some of those equations, we find that l max = 15 is high enough for most practical purposes.