The MarINvaders Toolkit


Summary
The introduction and establishment of alien (non-native) species in foreign ecosystems is a key threat to marine biodiversity (Katsanevakis et al., 2014; Molnar et al., 2008; Seebens et al., 2017).
Ecosystem pressures from invasive species are considered the most difficult to reverse (MEA, 2005) and are expected to increase in the near- and mid-term future (Seebens et al., 2021). Of particular concern are alien species that become established and out-compete local species on a large scale, thus becoming invasive. The Northern Pacific sea star (Asterias amurensis), for example, was introduced to Australia and Tasmania around the 1990s and has since become a major threat to endangered species in the seas around Australia, as well as disrupting Australian aquaculture (GISD, 2021). Global research efforts to estimate the native distribution and alien introduction of marine species are spread over several databases. In principle, these databases can be combined to assess the native/alien status of a certain species or of all species present in a marine ecoregion (Spalding et al., 2007), although they provide information at varying levels of resolution. The MarINvaders Toolkit cross-references these databases and harmonizes the retrieved species distribution and status information. This allows the user to assess the alien and native distribution of marine species across regions at the level of an individual species or an ecoregion.

Statement of need
The largest databases for gathering information on marine species distributions are:
• The Ocean Biodiversity Information System (OBIS, 2020), which provides data on marine taxa and species distributions. It lacks information on the native and alien ranges of species and on how a specific species is affected by aliens.
• The World Register of Marine Species (WoRMS; Horton et al., 2021), which contains information on native and alien species distributions.
• NatCon (Molnar et al., 2008), which contains information on over 330 marine invasive species, including non-native distributions by marine ecoregion, invasion pathways, and ecological impact and other threat scores.
Additionally, the International Union for Conservation of Nature (IUCN) provides data regarding invasive species through:
• The Global Invasive Species Database (GISD, 2021). This is a freely accessible, online searchable source of information about alien and invasive species that negatively impact biodiversity.
• The IUCN Red List ("The IUCN Red List of Threatened Species," 2020). The Red List can be queried manually for information about which species are affected by invasives in their natural habitat.
The main challenge in cross-referencing these data sources is the varying geographic scale at which alien and native species distributions are reported. Although most of the databases provide API access for programmatically retrieving species distribution data, there is, to our knowledge, currently no open-source software package that automatically collects species data from all of these databases and harmonizes the distribution data across them. MarINvaders aims to close this gap by providing a high-level interface for assessing the native and alien distribution of marine species at the level of an individual species as well as an ecoregion.

Functionality
MarINvaders consists of a Python 3 module that queries the open-access databases listed above for species data (sightings, threat levels and alien/native status) and also includes copies of the databases that cannot be queried online. When information on a specific marine ecoregion is requested, the OBIS API (v3) is used to retrieve all species with occurrence data within that ecoregion in the OBIS database. Each species is then looked up in the other databases to determine whether it is reported as alien there.
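The cross-referencing step described above can be sketched roughly as follows. This is an illustration only, not the MarINvaders implementation: the function name `flag_aliens`, the in-memory `ALIEN_RANGES` table, the ecoregion codes `ECO_A` and `ECO_B`, and the placeholder species are all assumptions made for this example, and no database is actually queried.

```python
# Hypothetical stand-in for the alien-range records held by the other
# databases: source name -> {species name -> ecoregions reported as alien}.
# All codes and the placeholder species are invented for illustration.
ALIEN_RANGES = {
    "WoRMS": {"Asterias amurensis": {"ECO_A"}},
    "NatCon": {
        "Asterias amurensis": {"ECO_A", "ECO_B"},
        "Hypothetical species B": {"ECO_B"},
    },
}


def flag_aliens(ecoregion, observed_species, alien_ranges=ALIEN_RANGES):
    """Map each observed species to the sources reporting it as alien here.

    `observed_species` plays the role of the species list obtained from
    occurrence data for the ecoregion. Species that no source reports as
    alien in `ecoregion` are left out of the result.
    """
    flagged = {}
    for species in observed_species:
        sources = sorted(
            source
            for source, ranges in alien_ranges.items()
            if ecoregion in ranges.get(species, set())
        )
        if sources:
            flagged[species] = sources
    return flagged


observed = [
    "Asterias amurensis",
    "Hypothetical species B",
    "Hypothetical species C",
]
aliens_in_a = flag_aliens("ECO_A", observed)
```

Keeping the list of reporting sources per species, rather than a single yes/no flag, makes it visible when only one database supports an alien classification.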
For WoRMS and OBIS data, MarINvaders uses API calls to request information on a specific species or region. The NatCon database is included in the repository and is installed together with the package. IUCN data (GISD and Red List) may not be redistributed and also cannot be queried automatically. We therefore made these data optional for the use of MarINvaders and give a detailed description of how to obtain them in the documentation (https://marinvaders.gitlab.io/marinvaders/iucn_data/). Although these data are not essential for using MarINvaders, we recommend adding them, as they provide additional information on alien ranges (GISD) and make it possible to assess which species are affected by aliens (IUCN Red List).
The databases provide geographical distributions on different scales. The NatCon distributions are given at the marine ecoregion level. Most of the WoRMS distributions are either IHO Sea Areas, Exclusive Economic Zones (EEZ), or a combination of these, and have a Marine Regions Geographic Identifier (MRGID), which is matched to a marine ecoregion by the use of shapefiles. GISD does not provide such MRGIDs but instead gives qualitative distributions such as country names. Most of these could still be matched to existing shapefiles by country/region name and subsequently be matched to marine ecoregions. All distributions that could not be matched automatically were looked up manually and matched to one or more marine ecoregions. The outcome of the manual matching is included in the source code ('marinvaders/data/GISD_and_WoRMS_qualitative_distributions_linked_to_MEOWs.xlsx').
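The two-stage matching of location names described above (automatic matching by name first, then a manually curated fallback) might look roughly like the sketch below. The lookup tables, their contents, the ecoregion codes and the function names are assumptions made for this example. The real toolkit matches against shapefiles and the spreadsheet named above, which this sketch does not read.

```python
import unicodedata

# Hypothetical lookup tables; all names and codes are invented for illustration.
# AUTOMATIC stands in for names that resolve through existing shapefiles,
# MANUAL for the manually curated matches.
AUTOMATIC = {
    "australia": ["ECO_A", "ECO_B"],
    "new zealand": ["ECO_C"],
}
MANUAL = {
    "example bay": ["ECO_A"],
}


def normalize(name):
    """Lower-case, strip accents and collapse whitespace in a location name."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return " ".join(ascii_only.lower().split())


def match_ecoregions(name):
    """Return (ecoregion codes, how the match was made) for a location name."""
    key = normalize(name)
    if key in AUTOMATIC:
        return AUTOMATIC[key], "automatic"
    if key in MANUAL:
        return MANUAL[key], "manual"
    return [], "unmatched"
```

For instance, `match_ecoregions("  AUSTRALIA ")` resolves through the automatic table, while a name found in neither table comes back as "unmatched" and would need manual curation.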
The results of a query through MarINvaders are returned as (Geo)pandas DataFrames, which can readily be used for subsequent analysis. In addition, MarINvaders provides several summary statistics that give an overview of the alien/native species within an ecoregion as well as of the global distribution of a specific species, separated into native and alien ranges.
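As a rough illustration of the two kinds of summaries mentioned above, the sketch below counts species by status within one ecoregion and splits the range of one species into native and alien ecoregions. It uses plain Python records instead of (Geo)pandas DataFrames to stay dependency-free. The record layout, function names, ecoregion codes and placeholder species are assumptions made for this example, not the toolkit's own output format.

```python
from collections import Counter, defaultdict

# Hypothetical harmonized records, one per species/ecoregion pair.
# Codes, statuses and placeholder species are invented for illustration.
records = [
    {"species": "Asterias amurensis", "ecoregion": "ECO_A", "status": "alien"},
    {"species": "Asterias amurensis", "ecoregion": "ECO_N", "status": "native"},
    {"species": "Hypothetical species B", "ecoregion": "ECO_A", "status": "native"},
    {"species": "Hypothetical species C", "ecoregion": "ECO_A", "status": "alien"},
]


def ecoregion_summary(records, ecoregion):
    """Count the records of each status (alien/native) within one ecoregion."""
    return Counter(r["status"] for r in records if r["ecoregion"] == ecoregion)


def species_ranges(records, species):
    """Split the ecoregions of one species into native and alien ranges."""
    ranges = defaultdict(set)
    for r in records:
        if r["species"] == species:
            ranges[r["status"]].add(r["ecoregion"])
    return {status: sorted(regions) for status, regions in ranges.items()}


summary_a = ecoregion_summary(records, "ECO_A")
sea_star_ranges = species_ranges(records, "Asterias amurensis")
```

With tabular data, the same summaries would typically be a filter followed by a group-by on the status column.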

Outlook
The MarINvaders Toolkit is part of a larger effort within the ERC ATLANTIS project (https://atlantis-erc.eu/), which assesses the impact of human activity on marine ecosystems. MarINvaders will play a central role in upcoming case studies and in the development of a web platform for assessing the marine environmental impacts of human activity.