maxsmooth: Derivative Constrained Function Fitting

maxsmooth is an optimisation routine written in Python (supporting version ≥ 3.6) for fitting Derivative Constrained Functions (DCFs) to data. DCFs are a family of functions which have derivatives that do not cross zero in the band of interest. Two special cases of DCF are Maximally Smooth Functions (MSFs) which have derivatives with order m ≥ 2 constrained and Completely Smooth Functions (CSFs) with m ≥ 1 constrained. Alternatively, we can constrain an arbitrary set of derivatives and we refer to these models as Partially Smooth Functions. Due to their constrained nature, DCFs can produce perfectly smooth fits to data and reveal non-smooth signals of interest in the residuals.


Statement of Need
The development of maxsmooth has been motivated by the problem of foreground modelling in Global 21-cm experiments (Bowman, Rogers, Monsalve, Mozdzen, & Mahesh, 2018;de Lera Acedo, 2019;Price et al., 2018;Singh et al., 2018). Global 21-cm cosmology is the study of the spin temperature of hydrogen gas and its relative magnitude compared to the Cosmic Microwave Background during the Cosmic Dawn (CD) and Epoch of Reionisation (EoR). During the CD and EoR the first stars formed and the properties of the hydrogen gas changed as it interacted with radiation from these stars (Barkana, 2016;Furlanetto, Oh, & Briggs, 2006;Pritchard & Loeb, 2012).
Modelling and removal of the foreground is essential for detection of the Global 21-cm signal. DCFs provide a powerful alternative to unconstrained polynomials for accurately modelling the smooth synchrotron/free-free emission foreground from the Galaxy and extragalactic radio sources.
To illustrate the abilities of maxsmooth we produce mock 21-cm experiment data and model and remove the foreground using an MSF. We add to a mock foreground, ν −2.5 , a Gaussian noise with standard deviation of 0.02 K and a Gaussian signal with amplitude 0.23 K, central frequency of 100 MHz and standard deviation of 10 MHz.
The figure below shows the residuals (bottom panel, green) when fitting and removing an MSF from the data (top panel) compared to the injected signal (bottom panel, red). While the removal of the foreground does not directly recover the injected signal, rather a smooth baseline subtracted version, we see the remnant of the signal in the residuals (see Bevins et al. (2020) for more details and examples). maxsmooth is applicable to any experiment in which the signal of interest has to be separated from comparatively high magnitude smooth signals or foregrounds.

maxsmooth
DCFs can be fitted with routines such as Basin-hopping (Wales & Doye, 1997) and Nelder-Mead (Nelder & Mead, 1965) and this has been the practice for 21-cm cosmology (Sathyanarayana Rao, Subrahmanyan, Udaya Shankar, & Chluba, 2017; Singh & Subrahmanyan, 2019). However, maxsmooth employs quadratic programming via CVXOPT to rapidly and efficiently fit DCFs which are constrained such that An example DCF from the maxsmooth library is given by where x and y are the independent and dependent variables respectively and N is the order of the fit with powers from 0 − (N − 1). The library is intended be extended by future contributions from users. We find that the use of quadratic programming makes maxsmooth approximately two orders of magnitude quicker than a Basin-hopping/Nelder-Mead approach. maxsmooth rephrases the above condition such that where the ± applies to a given m. This produces a set of sign spaces with different combinations of constrained positive and negative derivatives. In each sign space the associated where we are minimising χ 2 , G(s)a is a stacked matrix of derivative evaluations and a is the matrix of parameters we are optimising for. Q and q are given by here Φ is a matrix of basis function evaluations and y is a column matrix of the dependent data.
The discrete spaces can be searched in their entirety quickly and efficiently or a sign navigating algorithm can be invoked using maxsmooth reducing the fitting time. Division of the parameter space into sign spaces allows for a more complete exploration when compared to Basinhopping/Nelder-Mead based algorithms.
The sign navigating approach uses a cascading algorithm to identify a candidate optimum s and a. The algorithm starts with a randomly generated s. Each individual sign is then flipped, from the lowest order derivative first, until the objective function decreases in value. The signs associated with the lower χ 2 value become the optimum set and the process is repeated until χ 2 stops decreasing. This is followed by a limited exploration of the neighbouring sign spaces to identify the true global minimum.

Figure 2:
The time taken to fit polynomial data following an approximate x a power law using both maxsmooth quadratic programming methods and for comparison a method based in Basin-hopping and Nelder-Mead routines. We show the results using the later method up to N = 7 after which the method begins to fail without adjustments to the routine parameters. For N = 3 − 7 we find a maximum difference of 0.04% between the optimum maxsmooth χ 2 values and the Basin-hopping results. Figure taken from Bevins et al. (2020).
Documentation for maxsmooth is available at ReadTheDocs and the code can be found on Github. The code is also pip installable (PyPI). Continuous integration is performed with Travis and CircleCi. The associated code coverage can be found at CodeCov.