GCM-Filters: A Python Package for Diffusion-based Spatial Filtering of Gridded Data

1 Department of Applied Mathematics, University of Colorado Boulder, Boulder, CO, USA
2 Lamont-Doherty Earth Observatory, Columbia University, New York, NY, USA
3 Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
4 Climate and Global Dynamics Division, National Center for Atmospheric Research, Boulder, CO, USA
5 Woods Hole Oceanographic Institution, Woods Hole, MA, USA
6 Earth, Ocean and Ecological Sciences, University of Liverpool, UK
7 Research School of Earth Sciences, Australian National University, Canberra, Australia

DOI: 10.21105/joss.03947


Summary
GCM-Filters is a Python package that allows scientists to perform spatial filtering analysis in an easy, flexible and efficient way. The package implements the filtering method based on the discrete Laplacian operator that was introduced by Grooms et al. (2021). The filtering algorithm is analogous to smoothing via diffusion; hence the name diffusion-based filters. GCM-Filters can be used with either gridded observational data or gridded data produced by General Circulation Models (GCMs) of ocean, weather, and climate. Spatial filtering of observational or GCM data is a common analysis method in the Earth Sciences, for example to study oceanic and atmospheric motions at different spatial scales or to develop subgrid-scale parameterizations for ocean models.
GCM-Filters provides highly configurable filters, with the goal of being useful for a wide range of scientific applications. The user has different options for selecting the filter scale and filter shape. The filter scale can be defined in several ways: a fixed length scale (e.g., 100 km), a scale tied to a model grid scale (e.g., 1°), or a scale tied to a varying dynamical scale (e.g., the Rossby radius of deformation). As an example, Figure 1 shows unfiltered and filtered relative vorticity, where the filter scale is set to a model grid scale of 4°. GCM-Filters also allows for anisotropic, i.e., direction-dependent, filtering. Finally, the filter shape (currently either Gaussian or Taper) determines how sharply the filter separates scales above and below the target filter scale.

Statement of Need
Spatial filtering is commonly used as a scientific tool for analyzing gridded data. An example of an existing spatial filtering tool in Python is the ndimage.gaussian_filter function in SciPy, implemented as a sequence of convolution filters. While it is a valuable tool for image processing (or blurring), SciPy's Gaussian filter is of limited use for GCM data: it assumes a regular and rectangular Cartesian grid, employs a simple boundary condition, and offers little or no flexibility in the definitions of filter scale and shape. The Python package GCM-Filters is specifically designed to filter GCM data, and seeks to solve a number of challenges for the user:
1. GCM data comes on irregular curvilinear grids with spatially varying grid-cell geometry.
2. Continental boundaries require careful and special treatment when filtering ocean GCM output.
3. Earth Science applications benefit from configurable filters, where the definition of filter scale and shape is flexible.
4. GCM output is often too large to process in memory, requiring distributed and/or delayed execution.
The GCM-Filters algorithm (Grooms et al., 2021) applies a discrete Laplacian to smooth a field through an iterative process that resembles diffusion. The discrete Laplacian takes into account the varying grid-cell geometry and uses a no-flux boundary condition, mimicking how diffusion is internally implemented in GCMs. The no-flux boundary condition ensures that the filter preserves the integral:

$$\int_{\Omega} \bar{f}(x, y) \, dA = \int_{\Omega} f(x, y) \, dA,$$

where $f$ is the original field, $\bar{f}$ the filtered field, and $\Omega$ the ocean domain. Conservation of the integral is a desirable filter property for many physical quantities, for example energy or ocean salinity. More details on the filter properties can be found in Grooms et al. (2021).
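The integral-conservation property can be illustrated with a small, self-contained sketch. It is not the GCM-Filters algorithm of Grooms et al. (2021); it is plain explicit diffusion in one dimension, written in pure Python, on a grid with varying cell widths and a land barrier. The function names (`diffuse_no_flux`, `integral`), the time step, and the sample data are all hypothetical choices made for illustration.

```python
def diffuse_no_flux(f, dx, wet, n_steps=50, dt=0.2, kappa=1.0):
    """Explicit 1D diffusion on a non-uniform grid with no-flux walls at land.

    f   : cell values
    dx  : cell widths (may vary from cell to cell)
    wet : 1 for ocean cells, 0 for land cells
    """
    f = list(f)
    n = len(f)
    for _ in range(n_steps):
        # grad[i] is the diffusive gradient at the interface to the left of
        # cell i. It stays zero at the domain ends and at any interface that
        # touches land, which is the no-flux boundary condition.
        grad = [0.0] * (n + 1)
        for i in range(1, n):
            if wet[i - 1] and wet[i]:
                dist = 0.5 * (dx[i - 1] + dx[i])  # distance between cell centers
                grad[i] = kappa * (f[i] - f[i - 1]) / dist
        # Update ocean cells with the divergence of the interface gradients.
        for i in range(n):
            if wet[i]:
                f[i] += dt * (grad[i + 1] - grad[i]) / dx[i]
    return f


def integral(f, dx, wet):
    """Width-weighted integral of f over ocean cells only."""
    return sum(fi * dxi for fi, dxi, w in zip(f, dx, wet) if w)


# Hypothetical data: two ocean basins separated by two land cells.
dx = [1.0, 1.5, 2.0, 1.0, 1.0, 1.5, 2.0, 1.0, 1.5, 1.0]
wet = [1, 1, 1, 1, 0, 0, 1, 1, 1, 1]
f0 = [0.0, 4.0, 0.0, 2.0, 9.0, 9.0, 5.0, 0.0, 3.0, 0.0]

f_smooth = diffuse_no_flux(f0, dx, wet)

print("integral before:", integral(f0, dx, wet))
print("integral after: ", integral(f_smooth, dx, wet))
```

Because every interior interface contributes to its two neighboring cells with opposite signs, and interfaces touching land contribute nothing, the width-weighted sum over ocean cells is unchanged up to floating-point rounding while the field itself becomes smoother.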
An important goal of GCM-Filters is to enable computationally efficient filtering. The user can employ GCM-Filters on either CPUs or GPUs, with NumPy (Harris et al., 2020) or CuPy (Okuta et al., 2017) input data. GCM-Filters leverages Dask (Dask Development Team, 2016) and Xarray (Hoyer & Hamman, 2017) to support filtering of larger-than-memory datasets and to provide computational flexibility.

Usage
The main GCM-Filters class that the user will interface with is the gcm_filters.Filter object. When creating a filter object, the user specifies how they want to smooth their data, including the desired filter shape and filter scale. At this stage, the user also picks the grid type that matches their GCM data from a predefined list of grid types. Each grid type has an associated discrete Laplacian, and requires different grid variables that the user must provide (these are usually available to the user as part of the GCM output). Currently, GCM-Filters provides a number of different grid types and associated discrete Laplacians:
• Grid types with scalar Laplacians that can be used for filtering scalar fields, for example temperature or vorticity (see Figure 1). (Campin et al., 2021).
Atmospheric model grids are not yet supported, but could be implemented in GCM-Filters.
Users are encouraged to contribute more grid types and Laplacians via pull requests. While we are excited to share GCM-Filters at version 0.2.3, we plan to continue improving and maintaining the package for the long run and welcome new contributors from the broader community.