kima: Exoplanet detection in radial velocities

The radial-velocity (RV) method is one of the most successful in the detection of exoplanets, but is hindered by the intrinsic RV variations of the star, which can easily mimic or hide true planetary signals. kima is a package for the detection and characterization of exoplanets using RV data. It fits a sum of Keplerian curves to a timeseries of RV measurements and calculates the evidence for models with a fixed number Np of Keplerian signals, or after marginalising over Np. Moreover, kima can use a Gaussian process (GP) with a quasi-periodic kernel as a noise model, to deal with activity-induced signals. The hyperparameters of the GP are inferred together with the orbital parameters. The code is written in C++, but includes a helper Python package, pykima, which facilitates the analysis of the results.


Summary
The radial-velocity (RV) method is one of the most successful in the detection of exoplanets. An orbiting planet exerts a gravitational pull on its host star, which is observed as a periodic variation of the velocity of the star along the line of sight. By measuring the associated wavelength shifts in stellar spectra, an RV timeseries is constructed. These data provide information about the presence of one or more planets and allow the planet mass(es) and several orbital parameters to be determined (e.g. Fischer et al. 2016).
One of the main barriers to the detection of Earth-like planets with RVs is the intrinsic variability of the star, which can easily mimic or hide the true RV signals of planets. Gaussian processes (GPs) are now seen as a promising tool to model the correlated noise that arises from stellar-induced RV variations (e.g. Haywood et al. 2014).
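As a rough illustration of how an RV signal constrains the planet mass, the sketch below converts an RV semi-amplitude into a minimum mass m sin i using the standard mass-function relation. It assumes the planet is much less massive than the star. The function name, constants and argument names are illustrative and are not part of kima.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30     # solar mass, kg
M_JUP = 1.898e27     # Jupiter mass, kg
DAY = 86400.0        # seconds per day


def minimum_mass(K, P_days, e, M_star_msun):
    """Minimum planet mass m*sin(i) in kg.

    K           : RV semi-amplitude in m/s
    P_days      : orbital period in days
    e           : orbital eccentricity
    M_star_msun : stellar mass in solar masses

    Assumes m << M_star, so that (M_star + m) ~ M_star.
    """
    P = P_days * DAY
    M_star = M_star_msun * M_SUN
    return (K * math.sqrt(1.0 - e**2)
            * (P / (2.0 * math.pi * G)) ** (1.0 / 3.0)
            * M_star ** (2.0 / 3.0))


# Approximate values for a Jupiter-like planet around a Sun-like star
m = minimum_mass(K=12.5, P_days=4332.6, e=0.048, M_star_msun=1.0)
print(m / M_JUP)
```

For these inputs the result should be close to one Jupiter mass. Only m sin i is obtained, because RVs alone do not constrain the orbital inclination i.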
kima is a package for the detection and characterization of exoplanets using RV data. It fits a sum of Keplerian curves to a timeseries of RV measurements, using the Diffusive Nested Sampling algorithm (Brewer, Pártay, and Csányi 2011) to sample from the posterior distribution of the model parameters. This algorithm can sample the multimodal and correlated posteriors that often arise in this problem (e.g. Brewer and Donovan 2015).
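To make the "sum of Keplerian curves" concrete, here is a minimal Python sketch of such a model. It is not kima's C++ implementation: the function names, the parametrisation (period P, semi-amplitude K, eccentricity e, argument of periastron omega, mean anomaly M0 at a reference time t0) and the Newton solver are assumptions chosen for illustration.

```python
import numpy as np


def solve_kepler(M, e, tol=1e-12, max_iter=100):
    """Solve Kepler's equation E - e*sin(E) = M (M reduced to [0, 2*pi))."""
    M = np.mod(np.asarray(M, dtype=float), 2.0 * np.pi)
    # Starting from E = pi makes Newton's method converge monotonically
    E = np.full_like(M, np.pi)
    for _ in range(max_iter):
        dE = (E - e * np.sin(E) - M) / (1.0 - e * np.cos(E))
        E = E - dE
        if np.max(np.abs(dE)) < tol:
            break
    return E


def keplerian(t, P, K, e, omega, M0, t0=0.0):
    """RV contribution of a single Keplerian signal at times t."""
    t = np.asarray(t, dtype=float)
    M = 2.0 * np.pi * (t - t0) / P + M0          # mean anomaly
    E = solve_kepler(M, e)                       # eccentric anomaly
    nu = 2.0 * np.arctan2(np.sqrt(1.0 + e) * np.sin(E / 2.0),
                          np.sqrt(1.0 - e) * np.cos(E / 2.0))  # true anomaly
    return K * (np.cos(nu + omega) + e * np.cos(omega))


def rv_model(t, planets, vsys=0.0):
    """Systemic velocity plus the sum of the Keplerian signals.

    planets is a list of dicts with keys P, K, e, omega, M0.
    """
    rv = np.full_like(np.asarray(t, dtype=float), vsys)
    for p in planets:
        rv = rv + keplerian(t, **p)
    return rv


t = np.linspace(0.0, 100.0, 200)
planets = [dict(P=12.1, K=3.0, e=0.1, omega=0.5, M0=0.0),
           dict(P=40.3, K=1.5, e=0.3, omega=2.0, M0=1.0)]
rv = rv_model(t, planets, vsys=5.0)
```

In a sampler, the number of entries in planets would correspond to Np, and each entry's parameters would be drawn from their priors.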
Unlike similar open-source packages, kima calculates the fully marginalised likelihood, or evidence, either for a model with a fixed number Np of Keplerian signals or after marginalising over Np. In the latter case, Np itself is a free parameter and we sample from its posterior distribution using the trans-dimensional method proposed by Brewer (2014). Because kima uses the Diffusive Nested Sampling algorithm, the evidence values are still accurate when the likelihood function contains phase changes, which would make other algorithms (such as thermodynamic integration) unreliable (Skilling 2006). Moreover, kima can use a GP with a quasi-periodic kernel as a noise model, to deal with activity-induced signals. The hyperparameters of the GP are inferred together with the orbital parameters. Priors for each of the parameters can be easily set by the user, with a broad choice of standard probability distributions already implemented.
The code is written in C++, but also includes a helper Python package, pykima, which facilitates the analysis of the results. kima depends on the DNest4 and Eigen packages, which are included as submodules in the repository. Other (Python) dependencies are the numpy, scipy, matplotlib, and corner packages. Documentation can be found in the main repository, which also contains a set of examples of how to use kima that serve as the package's test suite.
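The GP noise model mentioned above can be sketched as follows. The text does not give kima's exact kernel parametrisation, so this example assumes one common form of the quasi-periodic kernel, with hypothetical hyperparameter names eta1 (amplitude), eta2 (decay timescale), eta3 (period) and eta4 (periodic length scale). The log-likelihood is evaluated on the residuals left after subtracting the Keplerian model.

```python
import numpy as np


def quasi_periodic_kernel(t1, t2, eta1, eta2, eta3, eta4):
    """Assumed form: eta1^2 * exp(-dt^2/(2 eta2^2) - 2 sin^2(pi dt/eta3)/eta4^2)."""
    dt = t1[:, None] - t2[None, :]
    return eta1**2 * np.exp(-dt**2 / (2.0 * eta2**2)
                            - 2.0 * np.sin(np.pi * dt / eta3)**2 / eta4**2)


def gp_log_likelihood(t, residuals, sigma, eta1, eta2, eta3, eta4):
    """Gaussian log-likelihood of the residuals under the GP noise model.

    sigma holds the per-point RV uncertainties, added on the diagonal.
    """
    C = quasi_periodic_kernel(t, t, eta1, eta2, eta3, eta4) + np.diag(sigma**2)
    L = np.linalg.cholesky(C)                       # C = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, residuals))
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    n = len(t)
    return -0.5 * (residuals @ alpha + log_det + n * np.log(2.0 * np.pi))


rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 100.0, 30))
sigma = np.full(30, 1.0)
residuals = rng.normal(0.0, 1.0, 30)
logL = gp_log_likelihood(t, residuals, sigma, eta1=2.0, eta2=30.0, eta3=25.0, eta4=0.5)
```

Here eta3 would play the role of the stellar rotation period and eta2 that of the active-region lifetime, under the assumed parametrisation. Setting eta1 to zero recovers the usual white-noise likelihood.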
Initial versions of this package were used in the analysis of HARPS RV data of the active planet-host CoRoT-7 (Faria et al. 2016), in which the orbital parameters of the two exoplanets CoRoT-7b and CoRoT-7c, as well as the rotation period of the star and the typical lifetime of active regions, were inferred from RV observations alone.
arXiv:1806.08305v1 [astro-ph.IM] 21 Jun 2018
Figure 1: Results from a typical analysis with kima. The top panel shows a simulated RV dataset with two injected "planets". The black curves represent ten samples from the posterior predictive distribution. On the top right, the posterior distribution for the number of planets Np clearly favours the detection of the two planets (this parameter had a uniform prior between 0 and 2). The panels in the bottom row show the posterior distributions for the orbital periods, the semi-amplitudes and the eccentricities of the two signals.
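A posterior for Np such as the one described for Figure 1 can be summarised from posterior samples as sketched below. The sample array is made up for illustration and the function names are hypothetical, not part of pykima. The sketch relies on the fact that, under a uniform prior on Np, the ratio of posterior probabilities of two values of Np equals the ratio of their evidences.

```python
import numpy as np


def np_posterior(n_planets_samples, np_max):
    """Posterior probability of each Np in 0..np_max from posterior samples."""
    counts = np.bincount(np.asarray(n_planets_samples, dtype=int),
                         minlength=np_max + 1)
    return counts / counts.sum()


def bayes_factors(posterior):
    """Evidence ratios Z(Np = n+1) / Z(Np = n), assuming a uniform prior on Np."""
    with np.errstate(divide="ignore", invalid="ignore"):
        return posterior[1:] / posterior[:-1]


# Made-up posterior samples of Np, for illustration only
samples = np.array([0] * 1 + [1] * 9 + [2] * 90)
post = np_posterior(samples, np_max=2)
bf = bayes_factors(post)
print(post, bf)
```

A common way to use these ratios is to adopt the largest Np for which the ratio relative to Np - 1 exceeds a chosen threshold. The threshold is a modelling decision and is not prescribed here.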