Quasi-Monte Carlo Methods in Python

NumPy random number generators and SciPy distributions are widely used to generate random numbers. However, challenges can arise when sampling in high dimensions. Quasi-Monte Carlo (QMC) methods address these problems but have arguably been hard to use. SciPy 1.7.0 introduced a new submodule, scipy.stats.qmc, that makes state-of-the-art QMC methods available.


Statement of need
NumPy pseudorandom number generators (numpy.random) have become the de facto standard for sampling random numbers in the scientific Python ecosystem. These methods are fast and reliable, and the results are repeatable when a seed is provided. However, sampling in high dimensions with pseudorandom numbers tends to produce gaps and clusters of points. When these random numbers are used in algorithms (including sampling, numerical integration, optimization) to solve deterministic problems, the resulting "Monte Carlo" (MC) methods converge slowly: the error typically decreases only as O(n^(-1/2)) with the number of samples n. In practice, this can mean that substantial computational resources are required to reach sufficient accuracy.
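The gaps and clusters mentioned above can be quantified with a discrepancy measure. The following is a minimal sketch, not a benchmark: the sample size, dimension, and seed values are arbitrary choices for illustration, and the `seed` keyword is the one introduced with SciPy 1.7.0 (more recent releases may prefer a differently named keyword).

```python
import numpy as np
from scipy.stats import qmc

n, d = 256, 2  # illustrative sample size (a power of 2) and dimension

# Pseudorandom points in the unit square.
rng = np.random.default_rng(12345)
mc_points = rng.random((n, d))

# Scrambled Sobol' points in the unit square.
sobol = qmc.Sobol(d=d, scramble=True, seed=12345)
qmc_points = sobol.random(n)

# Discrepancy measures the departure from a perfectly even spread of
# points; lower values mean a more uniform coverage of the unit square.
mc_disc = qmc.discrepancy(mc_points)
qmc_disc = qmc.discrepancy(qmc_points)

print(f"pseudorandom discrepancy: {mc_disc:.2e}")
print(f"Sobol' discrepancy:       {qmc_disc:.2e}")
```

For samples of this size, the Sobol' points should show a markedly lower discrepancy than the pseudorandom ones.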
In Quasi-Monte Carlo (QMC) methods (Niederreiter, 1992), the random numbers of Monte Carlo methods are replaced with a deterministic sequence of numbers that possesses many of the characteristics of a random sequence (e.g. reduction of variance with increasing sample size), but without these gaps and clusters. QMC determinism is independent of its implementation, language, and platform: the sequence is mathematically defined.
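To illustrate what "mathematically defined" means, the sketch below computes the first points of a two-dimensional Halton sequence by hand, using the radical inverse in bases 2 and 3, and compares them with the unscrambled output of scipy.stats.qmc.Halton. The helper `radical_inverse` is a hypothetical name written for this illustration; it is not part of SciPy.

```python
import numpy as np
from scipy.stats import qmc


def radical_inverse(index, base):
    """Mirror the base-`base` digits of `index` about the radix point."""
    result, scale = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * scale
        scale /= base
    return result


n = 8
bases = (2, 3)  # the first two primes, one per dimension

# Points computed directly from the definition of the sequence.
by_hand = np.array([[radical_inverse(i, b) for b in bases] for i in range(n)])

# The same points from SciPy, with scrambling disabled.
from_scipy = qmc.Halton(d=2, scramble=False).random(n)

print(np.allclose(by_hand, from_scipy))
```

Because no random state is involved, any implementation that follows the same definition reproduces these points.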
In many cases, a QMC sequence can be used as a drop-in replacement for a random number sequence while providing faster convergence, both in theory and in practice (Owen, 2019). When true stochasticity is required (e.g. statistical inference), QMC sequences can be "scrambled" using random numbers, and several smaller scrambled QMC sequences can often replace one large random sequence.
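As a sketch of the drop-in use and of scrambled replicates, the example below integrates a toy function over the unit hypercube, once with pseudorandom numbers and once with several independently scrambled Sobol' samples of the same total size. The integrand `f`, the dimension, the number of replicates, and the seed are assumptions chosen for illustration only.

```python
import numpy as np
from scipy.stats import qmc


def f(x):
    # Toy integrand on [0, 1]^d; its exact integral is d / 3.
    return np.sum(x**2, axis=1)


d, m, n_rep = 5, 10, 8  # dimension, 2**m points per replicate, replicates
exact = d / 3
rng = np.random.default_rng(2021)

# Plain Monte Carlo estimate using the same total number of points.
mc_estimate = f(rng.random((n_rep * 2**m, d))).mean()

# Each engine draws its own scrambling from `rng`, so the replicates are
# independent and their spread gives an error estimate.
replicates = np.array([
    f(qmc.Sobol(d=d, scramble=True, seed=rng).random_base2(m)).mean()
    for _ in range(n_rep)
])
qmc_estimate = replicates.mean()
qmc_stderr = replicates.std(ddof=1) / np.sqrt(n_rep)

print(f"MC error:  {abs(mc_estimate - exact):.2e}")
print(f"QMC error: {abs(qmc_estimate - exact):.2e} (std. error {qmc_stderr:.2e})")
```

The only change relative to the Monte Carlo version is the source of the points; the replicate means play the role that individual random draws play in a classical standard-error calculation.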
QMC methods were added to SciPy (Virtanen et al., 2020) after an extensive review and discussion period (Roy et al., 2021) that led to a very fruitful collaboration between SciPy's maintainers and renowned researchers in the field. For instance, our implementation inspired additional work on the importance of including the first point in the Sobol' sequence (Owen, 2020).
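The role of the first point can be seen on a small example. This is only a sketch of one aspect of the issue, under the assumption that one-dimensional stratification is a fair proxy for balance; `points_per_interval` is a hypothetical helper written for this illustration.

```python
import numpy as np
from scipy.stats import qmc

m = 3       # exponent of the sample size
n = 2**m    # 8 points


def points_per_interval(points):
    # Count the points in each of the n equal-width intervals
    # along the first coordinate.
    cells = np.floor(points[:, 0] * n).astype(int)
    return np.bincount(cells, minlength=n)


# Unscrambled Sobol' points: the first one is the origin.
full = qmc.Sobol(d=2, scramble=False).random_base2(m)
print(full[0])
print(points_per_interval(full))       # one point in every interval
print(points_per_interval(full[1:]))   # dropping the origin empties one

# Scrambling moves the first point away from the origin while keeping
# one point per interval.
scrambled = qmc.Sobol(d=2, scramble=True, seed=0).random_base2(m)
print(points_per_interval(scrambled))
```

Discarding the first point therefore leaves a hole in an otherwise balanced sample, whereas scrambling avoids sampling the origin without giving up the balance.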
Before the release of SciPy 1.7.0, the need for these functions was partially met in the scientific Python ecosystem by tutorials (e.g. blog posts) and niche packages, but the functions in SciPy have several advantages:
• Popularity: With millions of downloads per month, SciPy is one of the most downloaded scientific Python packages. New features immediately reach a wide range of users from all fields.
• Performance: The low-level functions are written in compiled languages such as Cython and optimized for speed and efficiency.
• Consistency: The APIs comply with the high standards of SciPy, function API reference and tutorials are thorough, and the interfaces share common features complementing other SciPy functions.
• Quality: As with all SciPy code, these functions were rigorously peer-reviewed for code quality and are extensively unit-tested. In addition, the implementations were produced in collaboration with the foremost experts in the QMC field.
Since the first release of all these new features, we have seen other libraries add support for and rely on SciPy's implementations, e.g. Optuna (Ishikawa et al., 2022) and SALib (Roy & Iwanaga, 2022