tbeptools: An R package for synthesizing estuarine data for environmental research

Summary
Many environmental programs report on the status and trends of natural resources to inform management decisions for protecting or restoring environmental condition. The National Estuary Program (NEP) in the United States is one example of a resource management institution focused on "estuaries of national significance" that provides place-based solutions to managing coastal resources. There are 28 NEPs in the United States, each with similar but location-specific programmatic goals to address environmental challenges related to water quality, alteration of hydrologic flows, invasive species, climate change, declines in fish and wildlife populations, pathogens and other contaminants, and stormwater management. A critical need of each NEP is the synthesis of data from disparate sources that can inform management response to address these environmental challenges.
The Tampa Bay Estuary Program (TBEP) in Florida, USA is responsible for developing and implementing a place-based plan to sustain historical and future progress in the restoration of Tampa Bay (N. O'Hara, Shafer Consulting, Inc., 2017). The needs of TBEP for reporting on indicators of environmental condition are similar to those of other environmental organizations. Multiple local and regional partners collect data that are used for different reporting products. Without data synthesis tools that are transparent, accessible, and reproducible, NEP staff and colleagues waste time and resources compiling information by hand. The tbeptools R software package can be used for routine development of reporting products, allowing for more efficient use of limited resources and a more effective approach to communicating research to environmental decision-makers. Functions in tbeptools also support the creation of content for interactive, online dashboards that can facilitate more informed decisions without requiring an intimate understanding of the R programming language or the methods for analysis.
The tbeptools package also addresses challenges associated with data retrieval and assessment of environmental data relative to important policy or management targets. For each environmental indicator, functions are included to import required data directly from sources, removing the need to manually obtain information prior to reporting. These functions are integrated into summary report cards that are generated automatically through continuous integration services (i.e., GitHub Actions), which frees the analyst from external downloads, analysis, and copying of results, all of which can introduce errors in reporting. Similar packages provide seamless access to database services (e.g., dataRetrieval, De Cicco et al., 2021), but few packages link these data sources directly to analysis and reporting as in tbeptools (but see wqindex, Thorley et al., 2018). Management targets and regulatory thresholds based on summary assessments of data, either directly from sources or included as supplementary data in the package, are also hard-coded into the functions.

Statement of need
The tbeptools R package was developed to automate data synthesis and analysis for many of the environmental indicators for Tampa Bay, with more general application to commonly available datasets for estuaries. The functions in the package were developed to extract methods from existing technical documents and to make them available in an open source programming environment. By making these tools available as an R package, routine assessments are now accomplished more quickly and other researchers can use the tools to develop more specific analysis pipelines.
Most of the NEPs do not have analysis software to operationalize data import, analysis, and plotting for reporting. Recently, a similar software package, peptools (Marcus , was developed for the Peconic Estuary Partnership (New York, USA) using many of the functions in tbeptools to develop reporting products for a new water quality monitoring program. This successful technology transfer demonstrates the added value of presenting these methods in an open source environment available for discovery and reuse by others. We expect other NEPs to begin using these tools as their application becomes more widespread among estuarine researchers.
Beyond the NEPs, tbeptools is an effective example of an R package for implementing technical methods in existing literature and reports that can be used to support environmental monitoring and assessment needs for science-based decisions. To this end, the tbeptools package was also created to support the development of online dashboards created in R Shiny (Chang et al., 2021). Dashboards are powerful tools to increase accessibility for end users to engage with scientific products without the need to understand technical details in their creation. However, providing the underlying methods as source code in an R package increases transparency and reproducibility of reporting products if users require a more detailed understanding of how the content was created. Currently, the tbeptools package supports dashboards created by TBEP for the assessment of water quality (Figure 1, M. W. Beck, 2020a), seagrasses (M. W. Beck, 2020c), nekton communities (M. W. Beck, 2020b), and tidal creeks (M. W. Beck & Wessel, 2020). Resource management agencies or similar institutions could follow this approach to facilitate development of front-end products for more informed decision-making.

Figure 1: The TBEP water quality dashboard, demonstrating use of the tbeptools R package to generate summary plots for specific bay segments.

Example usage
The function names were chosen with a typical analysis workflow in mind: read functions import data from a source (typically an online repository or stable URL), anlz functions analyze the imported data using methods in existing technical documents or published papers, and show functions display the results as a summary graphic for use by environmental managers. The functions are used to report on water quality (M. Beck et al., 2021), fisheries (Schrandt et al., 2021), benthic condition (D.J. Karlen, T. Dix, B.K. Goetting, S.E. Markham, K. Campbell, J. Jernigan, J. Christian, K. Martinez, A. Chacour, 2020), tidal creeks (Wessel et al., 2021), and seagrass transect data (Sherwood et al., 2017). The vignettes for the package are topically organized to describe the functions that apply to each of the indicators.
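The read, anlz, and show naming convention described above can be illustrated with a small base R sketch. The functions read_demo, anlz_demo, and show_demo are hypothetical stand-ins and are not part of tbeptools; the data are synthetic and the target value is made up purely for illustration.

```r
# Hypothetical stand-ins for the read / anlz / show steps (not tbeptools functions)

# read: import raw observations (synthetic monthly chlorophyll-a values here)
read_demo <- function() {
  set.seed(42)
  data.frame(
    yr = rep(2015:2019, each = 12),
    mo = rep(1:12, times = 5),
    chla = round(runif(60, min = 5, max = 15), 1)
  )
}

# anlz: summarize the imported data as annual means
anlz_demo <- function(dat) {
  aggregate(chla ~ yr, data = dat, FUN = mean)
}

# show: plot annual means against a made-up management target
show_demo <- function(avedat, target = 10) {
  plot(avedat$yr, avedat$chla, type = "b",
       xlab = "Year", ylab = "Mean chlorophyll-a (ug/L)")
  abline(h = target, lty = 2)  # dashed line marks the illustrative target
  invisible(avedat)
}

rawdat <- read_demo()
avedat <- anlz_demo(rawdat)
show_demo(avedat)
```

Keeping each step in its own function means the output of one step is the input to the next, so an analyst can stop after the anlz step to inspect the summarized values or pass them on to a different plotting routine.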
The following example demonstrates use of a subset of the functions for water quality data to read a file from the Hillsborough County Environmental Protection Commission long-term monitoring dataset (available from https://www.tampabay.wateratlas.usf.edu/), analyze monthly and annual averages by major bay segments of Tampa Bay, and plot an annual time series for one of the bay segments.
', yrrng = c(1975, 2020))
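A minimal sketch of the workflow described above, assuming the tbeptools package is installed and using the function names from its documentation (read_importwq, anlz_avedat, show_thrplot). To avoid a download, the sketch uses the epcdata dataset bundled with the package. The file name in the commented import line, the choice of Old Tampa Bay ("OTB") as the bay segment, and chlorophyll-a ("chla") as the plotted variable are assumptions made for illustration.

```r
library(tbeptools)

# Import step: epcdata is the water quality dataset bundled with tbeptools.
# A current copy could instead be imported with read_importwq(), which needs
# a local file path (hypothetical here) and network access:
# wqdat <- read_importwq("wqdata.xlsx", download_latest = TRUE)
wqdat <- epcdata

# Analyze step: monthly and annual averages by major bay segment
avedat <- anlz_avedat(wqdat)

# Show step: annual time series for one bay segment
# (segment and variable are assumed for illustration)
p <- show_thrplot(wqdat, bay_segment = "OTB", thr = "chla",
                  yrrng = c(1975, 2020))
p
```

The plotting function takes the imported data directly, so the analyze step is shown here only to make the intermediate averages available for inspection.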