Eureka!: An End-to-End Pipeline for JWST Time-Series Observations

$\texttt{Eureka!}$ is a data reduction and analysis pipeline for exoplanet time-series observations, with a particular focus on JWST data. Over the next 1-2 decades, JWST will pursue four main science themes: Early Universe, Galaxies Over Time, Star Lifecycle, and Other Worlds. Our focus is on providing the astronomy community with an open-source tool for the reduction and analysis of time-series observations of exoplanets in pursuit of the fourth of these themes, Other Worlds. The goal of $\texttt{Eureka!}$ is to provide an end-to-end pipeline that starts with uncalibrated FITS files and ultimately yields precise exoplanet spectra. The pipeline has a modular structure with six stages, and each stage uses a "Eureka! Control File" (ECF) to allow for easy control of the pipeline's behavior. Stage 5 also uses a "Eureka! Parameter File" (EPF) to control the fitted parameters. We provide template ECFs for the MIRI, NIRCam, NIRISS, and NIRSpec instruments on JWST and the WFC3 instrument on the Hubble Space Telescope (HST). These templates give users a good starting point for their analyses, but $\texttt{Eureka!}$ is not intended to be used as a black-box tool, and users should expect to fine-tune some settings for each observation in order to achieve optimal results. At each stage, the pipeline creates intermediate figures and outputs that allow users to compare $\texttt{Eureka!}$'s performance using different parameter settings or to compare $\texttt{Eureka!}$ with an independent pipeline. The ECF used to run each stage is also copied into that stage's output folder to enhance reproducibility. Finally, while $\texttt{Eureka!}$ has been optimized for exoplanet observations (especially the later stages of the code), much of the core functionality could also be repurposed for JWST time-series observations in other research domains thanks to $\texttt{Eureka!}$'s modularity.


Summary
Eureka! is a data reduction and analysis pipeline for exoplanet time-series observations, with a particular focus on James Webb Space Telescope (JWST, Gardner et al., 2006) data. JWST was launched on December 25, 2021 and over the next 1-2 decades will pursue four main science themes: Early Universe, Galaxies Over Time, Star Lifecycle, and Other Worlds. Our focus is on providing the astronomy community with an open-source tool for the reduction and analysis of time-series observations of exoplanets in pursuit of the fourth of these themes, Other Worlds. The goal of Eureka! is to provide an end-to-end pipeline that starts with raw, uncalibrated FITS files and ultimately yields precise exoplanet transmission and/or emission spectra. The pipeline has a modular structure with six stages, and each stage uses a "Eureka! Control File" (ECF; these files use the .ecf file extension) to allow for easy control of the pipeline's behavior. Stage 5 also uses a "Eureka! Parameter File" (EPF; these files use the .epf file extension) to control the fitted parameters. We provide template ECFs for the MIRI (Rieke et al., 2015), NIRCam (Horner & Rieke, 2004), NIRISS (Maszkiewicz, 2017), and NIRSpec (Bagnasco et al., 2007) instruments on JWST and the WFC3 instrument (Kimble et al., 2008) on the Hubble Space Telescope (HST, Bahcall, 1986). These templates give users a good starting point for their analyses, but Eureka! is not intended to be used as a black-box tool, and users should expect to fine-tune some settings for each observation in order to achieve optimal results. At each stage, the pipeline creates intermediate figures and outputs that allow users to compare Eureka!'s performance using different parameter settings or to compare Eureka! with an independent pipeline. The ECF used to run each stage is also copied into that stage's output folder to enhance reproducibility. Finally, while Eureka! has been optimized for exoplanet observations (especially the later stages of the code), much of the core functionality could also be repurposed for JWST time-series observations in other research domains thanks to Eureka!'s modularity.

Outline of Eureka!'s Stages
Eureka! is broken down into six stages, which are as follows (also summarized in Figure 1):
• Stage 1: An optional step that calibrates raw data (converts ramps to slopes for JWST observations). This step can be skipped within Eureka! if you would rather use the Stage 1 outputs from the jwst pipeline (Bushouse et al., 2022).
• Stage 2: An optional step that further calibrates Stage 1 data (performs flat-fielding, unit conversion, etc. for JWST observations). This step can be skipped within Eureka! if you would rather use the Stage 2 outputs from the jwst pipeline.
• Stage 3: Using Stage 2 outputs, performs background subtraction and optimal spectral extraction. For spectroscopic observations, this stage generates a time series of 1D spectra. For photometric observations, this stage generates a single light curve of flux versus time.
• Stage 4: Using Stage 3 outputs, generates spectroscopic light curves by binning the time series of 1D spectra along the wavelength axis. Optionally removes drift/jitter along the dispersion direction and/or sigma clips outliers.
• Stage 5: Fits the light curves with noise and astrophysical models using different optimization or sampling algorithms.
• Stage 6: Displays the planet spectrum in figure and table form using results from the Stage 5 fits.
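To make the Stage 4 description concrete, the following is a minimal sketch of wavelength binning followed by outlier flagging. It is not Eureka!'s implementation: the function names `bin_spectra` and `sigma_clip_mask`, the array shapes, the use of a median-absolute-deviation scatter estimate, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def bin_spectra(wave, spectra, edges):
    """Sum a (n_int, n_wave) time series of 1D spectra into wavelength bins.

    Hypothetical helper: returns an array of shape (n_bins, n_int), i.e.
    one light curve per bin. Bins are half-open: edges[i] <= wave < edges[i+1].
    """
    wave = np.asarray(wave, dtype=float)
    spectra = np.asarray(spectra, dtype=float)
    n_bins = len(edges) - 1
    light_curves = np.full((n_bins, spectra.shape[0]), np.nan)
    for i in range(n_bins):
        in_bin = (wave >= edges[i]) & (wave < edges[i + 1])
        if in_bin.any():
            light_curves[i] = np.nansum(spectra[:, in_bin], axis=1)
    return light_curves


def sigma_clip_mask(flux, sigma=5.0):
    """Flag points more than `sigma` robust standard deviations from the median.

    The scatter is estimated from the median absolute deviation (MAD),
    an assumption of this sketch rather than a documented pipeline choice.
    """
    flux = np.asarray(flux, dtype=float)
    median = np.nanmedian(flux)
    robust_std = 1.4826 * np.nanmedian(np.abs(flux - median))
    if robust_std == 0:
        return np.zeros(flux.shape, dtype=bool)
    return np.abs(flux - median) > sigma * robust_std


# Synthetic example: 50 integrations, 100 wavelength pixels, one bad integration.
rng = np.random.default_rng(0)
wave = np.linspace(1.0, 2.0, 100)            # microns (assumed units)
spectra = 1000.0 + rng.normal(0.0, 1.0, size=(50, 100))
spectra[10] += 50.0                          # inject an outlier integration
edges = np.linspace(1.0, 2.0, 6)             # five equal-width bins
light_curves = bin_spectra(wave, spectra, edges)
mask = sigma_clip_mask(light_curves[0])      # True where a point is an outlier
```

Each row of `light_curves` is a spectroscopic light curve that a later fitting stage could model; flagged points would typically be masked rather than deleted so that the time axis stays aligned across bins.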

Differences From the jwst Pipeline
Eureka!'s Stage 1 offers a few alternative, experimental ramp-fitting methods compared to the jwst pipeline, but mostly acts as a wrapper that lets you call the jwst pipeline in the same format as the rest of Eureka!. Similarly, Eureka!'s Stage 2 acts solely as a wrapper for the jwst pipeline. Meanwhile, Eureka!'s Stages 3 through 6 completely depart from the jwst pipeline and offer specialized background subtraction, source extraction, wavelength binning, sigma clipping, fitting, and plotting routines with heritage from past space-based exoplanet science.
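For readers unfamiliar with the term, "ramp fitting" converts the non-destructive up-the-ramp reads of each pixel into a single count rate (slope). The sketch below shows only the simplest version, an unweighted least-squares line fit per pixel. It is not the algorithm used by the jwst pipeline or by Eureka!'s experimental methods, which also deal with jump detection, saturation, and weighting; the function name `fit_ramp_slopes`, the array layout, and the group time are assumptions.

```python
import numpy as np


def fit_ramp_slopes(ramp, group_time):
    """Unweighted least-squares slope for every pixel of one integration.

    Hypothetical helper. `ramp` has shape (n_groups, ny, nx) in counts and
    `group_time` is the time between groups in seconds. Returns an array of
    shape (ny, nx) in counts per second.
    """
    ramp = np.asarray(ramp, dtype=float)
    n_groups = ramp.shape[0]
    t = group_time * np.arange(n_groups)
    t_centered = t - t.mean()
    # Closed-form slope: sum((t - t_mean) * y) / sum((t - t_mean)**2)
    numerator = np.tensordot(t_centered, ramp, axes=(0, 0))
    return numerator / np.sum(t_centered**2)


# Noise-free synthetic ramps for a 2x2 detector with known count rates.
true_slopes = np.array([[1.0, 2.0], [3.0, 4.0]])       # counts/s
group_time = 10.0                                      # s (assumed)
t = group_time * np.arange(5)
ramp = 100.0 + true_slopes[None, :, :] * t[:, None, None]
slopes = fit_ramp_slopes(ramp, group_time)
```

On these noise-free ramps the recovered `slopes` match `true_slopes` to floating-point precision; with real data the choice of weighting and outlier handling is what distinguishes one ramp-fitting method from another.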

Statement of Need
The calibration, reduction, and fitting of exoplanet time-series observations is a challenging problem with tunable parameters at every stage, many of which significantly impact the final results. Typically, the default calibration pipeline from astronomical observatories is insufficiently tailored for exoplanet time-series observations because it is optimized for other science use cases. As such, it is common practice to develop a custom data analysis pipeline that starts from the original, uncalibrated images. Historically, data analysis pipelines have often been proprietary, so each new user of an instrument or telescope has had to develop their own pipeline. Also, clearly specifying the analysis procedure can be challenging, especially with proprietary code, which erodes reproducibility. Eureka! seeks to be a next-generation data analysis pipeline for next-generation observations from JWST with open-source and well-documented code for easier adoption; modular code for easier customization while maintaining a consistent framework; and easy-to-use but powerful inputs and outputs for increased automation, increased reproducibility, and more thorough intercomparisons. By also allowing for analyses of HST observations within the same framework, users will be able to combine new and old observations to develop a more complete understanding of individual targets or even entire populations.

Similar Tools
We will now discuss the broader data reduction and fitting ecosystem in which Eureka! lives. Several similar open-source tools are discussed below to provide additional context, but this is not meant to be a comprehensive list.
As mentioned above, Eureka! makes use of the first two stages of jwst (Bushouse et al., 2022) while offering significantly different extraction routines and novel spectral binning and fitting routines beyond what is contained in jwst. Eureka! bears similarities to the POET (Cubillos et al., 2013; Stevenson et al., 2012) and WFC3 pipelines, developed for Spitzer/IRAC and HST/WFC3 observations respectively; in fact, much of the code from those pipelines has been incorporated into Eureka!. Eureka! is near feature parity with WFC3, but the Spitzer-specific parts of the POET pipeline have not been incorporated into Eureka!. The SPCA (Bell et al., 2021; Dang et al., 2018) pipeline developed for the reduction and fitting of Spitzer/IRAC observations also bears some similarity to this pipeline, and some snippets of that pipeline have also been incorporated into Eureka!. The tshirt (Schlawin & Glidic, 2022) package also offers spectral and photometric extraction routines that work for HST and JWST data. PACMAN (Zieba & Kreidberg, 2022) is another open-source end-to-end pipeline developed for HST/WFC3 observations. The exoplanet (Foreman-Mackey et al., 2021) and juliet (Espinoza et al., 2019) packages offer capabilities similar to the observation-fitting parts of Eureka!.
Momcheva for useful discussions. Support for this work was provided in part by NASA through a grant from the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS 5-03127. In addition, we would like to thank the Transiting Exoplanet Community Early Release Science program for organizing meetings that contributed to the writing of Eureka!.