netrd: A library for network reconstruction and graph distances

Over the last two decades, alongside the increased availability of large network datasets, we have witnessed the rapid rise of network science. For many systems, however, the data we have access to is not a direct description of the underlying network. More and more, we see the drive to study networks that have been inferred or reconstructed from non-network data, in particular by using time series data from the nodes in a system to infer likely connections between them. Selecting the most appropriate technique for this task is a challenging problem in network science. Different reconstruction techniques usually rest on different assumptions, and their performance varies from system to system in the real world. One way around this problem could be to use several different reconstruction techniques and compare the resulting networks. However, network comparison is also not an easy problem, as it is not obvious how best to quantify the differences between two networks, in part because of the diversity of tools for doing so. The netrd Python package seeks to address these two parallel problems in network science by providing, to our knowledge, the most extensive collection of both network reconstruction techniques and network comparison techniques (often referred to as graph distances) in a single library (https://github.com/netsiphd/netrd). In this article, we detail the two main functionalities of the netrd package. Along the way, we describe some of its other useful features. This package builds on commonly used Python packages (e.g. networkx [7], numpy [8], scipy [9]) and is already a widely used resource for network scientists and other multidisciplinary researchers. With ongoing open-source development, we see this as a tool that will continue to be used by all sorts of researchers to come.

Network reconstruction from time series data
Given time series data, $TS$, of the behavior of $N$ nodes / components / sensors of a system over the course of $L$ timesteps, and given the assumption that the behavior of every node, $v_i$, may have been influenced by the past behavior of other nodes, $v_j$, there are dozens of techniques that can be used to infer which connections, $e_{ij}$, are likely to exist between the nodes. That is, we can use one of many network reconstruction techniques to create a network representation, $G_r$, that attempts to best capture the relationships between the time series of every node in $TS$. netrd is a Python package that lets users perform this network reconstruction task using 17 different techniques, meaning that many different networks can be created from a single time series dataset. For example, in Figure 1 we show the outputs of 15 different reconstruction techniques applied to time series data generated from an example network [10,11,12,13,14,15,16,17,18,19,20,21].
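To make the reconstruction task concrete, the following is a minimal sketch of one of the simplest ideas in this family: link two nodes whenever the absolute Pearson correlation between their time series exceeds a threshold. It is written directly in numpy and is not netrd's implementation; the function name `reconstruct_by_correlation`, the threshold value, and the toy data are assumptions made for illustration.

```python
import numpy as np


def reconstruct_by_correlation(TS, threshold=0.5):
    """Infer an undirected adjacency matrix from an (N, L) time series array.

    Hypothetical helper: nodes i and j are linked when the absolute Pearson
    correlation between rows i and j of TS exceeds `threshold`.
    """
    TS = np.asarray(TS, dtype=float)
    corr = np.corrcoef(TS)        # (N, N) correlations between rows of TS
    np.fill_diagonal(corr, 0.0)   # ignore each node's correlation with itself
    return (np.abs(corr) > threshold).astype(int)


# Toy data: node 1 is a noisy copy of node 0; node 2 is independent noise.
rng = np.random.default_rng(0)
L = 500
x0 = rng.normal(size=L)
x1 = x0 + 0.1 * rng.normal(size=L)
x2 = rng.normal(size=L)
TS = np.vstack([x0, x1, x2])      # shape (N, L) = (3, 500)

A_r = reconstruct_by_correlation(TS, threshold=0.5)
print(A_r)  # only the pair (0, 1) is linked
```

The threshold is a free parameter here, which hints at why different techniques (and different settings of the same technique) can return different networks from the same $TS$.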

Simulated network dynamics
Practitioners often apply these network reconstruction algorithms to real time series data. For example, in neuroscience, researchers often try to reconstruct functional networks from time series readouts of neural activity [11]. In economics, researchers can infer networks of influence between companies based on time series of changes in companies' stock prices [22]. At the same time, it is often helpful to be able to simulate arbitrary time series dynamics on randomly generated networks, since this provides a controlled setting in which to assess the performance of network reconstruction algorithms. For this reason, the netrd package also includes a number of different techniques for simulating dynamics on networks.
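As an illustration of this controlled setting, the sketch below generates a random network and simulates simple noisy linear dynamics on it, producing an (N, L) array that a reconstruction method could then be tested against. This is not one of netrd's dynamics models; the helper names `random_graph` and `simulate_linear_dynamics`, the update rule, and all parameter values are assumptions chosen for brevity.

```python
import numpy as np


def random_graph(n, p, rng):
    """Adjacency matrix of an undirected Erdos-Renyi graph G(n, p), no self-loops."""
    upper = np.triu(rng.random((n, n)) < p, k=1)   # sample each pair once
    return (upper | upper.T).astype(int)           # mirror to make it symmetric


def simulate_linear_dynamics(A, L, coupling=0.5, noise=1.0, rng=None):
    """Simulate x[t] = coupling * W @ x[t-1] + noise term; returns an (N, L) array.

    W is A with each row normalized to sum to 1 (isolated nodes keep a zero
    row), so every node is pulled toward the mean of its neighbors' past state.
    """
    rng = np.random.default_rng() if rng is None else rng
    A = np.asarray(A, dtype=float)
    degree = A.sum(axis=1, keepdims=True)
    W = np.divide(A, degree, out=np.zeros_like(A), where=degree > 0)
    N = A.shape[0]
    TS = np.zeros((N, L))
    TS[:, 0] = rng.normal(size=N)                  # random initial condition
    for t in range(1, L):
        TS[:, t] = coupling * W @ TS[:, t - 1] + noise * rng.normal(size=N)
    return TS


rng = np.random.default_rng(42)
A = random_graph(10, 0.3, rng)                     # ground-truth network
TS = simulate_linear_dynamics(A, L=200, rng=rng)   # synthetic time series
print(A.shape, TS.shape)
```

Because the ground-truth adjacency matrix `A` is known, the output of any reconstruction method applied to `TS` can be compared against it directly.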

Comparing networks using graph distances
A common goal when studying networks is to describe and quantify how different two networks are. This is a challenging problem, as there are countless axes upon which two networks can differ; as such, a number of graph distance measures have emerged over the years attempting to address this problem. As is the case for many hard problems in network science, it can be difficult to know which (of many) measures are suited for a given setting. In netrd, we consolidate over 20 different graph distance measures into a single package [23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41]. Figure 2 shows an example of just how different these measures can be when comparing two networks, $G_1$ and $G_2$. This submodule in netrd has already been used in recent work that offers a novel characterization of the graph distance literature [42].

Related software packages
In the network reconstruction literature, there are often software repositories that detail a single technique or a few related ones. For example, Lizier (2014) implemented a Java package (portable to Python, Octave, R, Julia, Clojure, and MATLAB) that uses information-theoretic approaches for inferring network structure from time series data [43]. Runge et al. (2019) created a Python package that combines linear or nonlinear conditional independence tests with a causal discovery algorithm to reconstruct causal networks from large-scale time series datasets [44]. These are two examples of powerful and widely used packages, though neither includes techniques as wide-ranging as those in netrd (nor were they explicitly designed to). In the graph distance literature, the same trend is broadly true: many one-off software repositories exist for specific measures. However, there are some packages that do include multiple graph distances; for example, Wills (2017) created a NetComp package that includes several variants of a few distance measures included here [45].

Figure 1: Example of the network reconstruction pipeline. (Top row) A sample network, its adjacency matrix, and an example time series, $TS$, of node-level activity simulated on the network. (Bottom rows) The outputs of 15 different network reconstruction algorithms, each using $TS$ to create a new adjacency matrix that captures key structural properties of the original network.

Figure 2: Example of the graph distance measures in netrd. Here, we measure the graph distance between two networks using 20 different distance measures from netrd.
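As a small illustration of why graph distance measures can disagree, the sketch below compares two toy networks using two simple edge-based measures, the Jaccard distance and a normalized Hamming distance. It uses only the Python standard library and is not netrd's implementation; the function names, the normalization of the Hamming distance by the number of possible edges, and the example edge lists are assumptions made for illustration.

```python
def edge_set(edges):
    """Normalize an iterable of undirected edges so that (u, v) equals (v, u)."""
    return {frozenset(e) for e in edges}


def jaccard_distance(edges_1, edges_2):
    """One minus the ratio of shared edges to edges present in either graph.

    Defined here as 0.0 when both graphs have no edges.
    """
    e1, e2 = edge_set(edges_1), edge_set(edges_2)
    union = e1 | e2
    if not union:
        return 0.0
    return 1.0 - len(e1 & e2) / len(union)


def hamming_distance(edges_1, edges_2, n_nodes):
    """Fraction of the n(n-1)/2 possible undirected edges on which the graphs differ."""
    e1, e2 = edge_set(edges_1), edge_set(edges_2)
    return len(e1 ^ e2) / (n_nodes * (n_nodes - 1) / 2)


# Two toy networks on the same 4 nodes, given as undirected edge lists.
G1 = [(0, 1), (1, 2), (2, 3)]
G2 = [(1, 0), (1, 2), (0, 3)]

d_jaccard = jaccard_distance(G1, G2)             # 2 shared of 4 total -> 0.5
d_hamming = hamming_distance(G1, G2, n_nodes=4)  # 2 differing of 6 possible -> 1/3
print(d_jaccard, d_hamming)
```

Even for this tiny pair of networks the two measures return different values, because they normalize the same edge differences in different ways.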