Raphtory: The temporal graph engine for Rust and Python

SUMMARY
Raphtory is a platform for building and analysing temporal networks. The library includes methods for creating networks from a variety of data sources, algorithms to explore their structure and evolution, and an extensible GraphQL server for deploying applications built on top. Raphtory's core engine is built in Rust for efficiency, with Python interfaces for ease of use. Raphtory is developed by network scientists with backgrounds in Physics, Applied Mathematics, Engineering and Computer Science, for use across academia and industry.

STATEMENT OF NEED
Networks are at the core of data science solutions in a range of domains, including computer science, computational social science, and the life sciences [1]. Networks are a powerful language focusing on the connectivity of systems, and offer a rich toolbox to extract greater understanding from data. Several network analysis tools exist, including NetworkX [2], graph-tool [3] and igraph [4], and are freely accessible to scientists, practitioners and data miners.
However, with abundant cheap storage and tools for logging every event that occurs in an ecosystem, datasets have become increasingly rich, combining different types of information that cannot be incorporated in a standard network model [5]. In particular, the temporal nature of many complex systems has led to the emergence of the field of temporal networks, with its own models and algorithms [6,7].
Unfortunately, despite active academic research in the last decade, no efficient, generalised and production-ready system has been developed to explore the temporal dimension of networks. To support practitioners who wish to exploit both the structure and dynamics of their data, we have developed Raphtory.

RELATED SOFTWARE
Besides the aforementioned packages, few open access tools have been developed for the mining of temporal networks, with the existing solutions focusing on specific sub-problems within the space. Those which have attempted to generalise to all temporal network analysis are either actively under development, but too preliminary to use in production, or have been abandoned due to lack of funding or changing research goals.
As examples of these three categories: Pathpy is a Python package for the analysis of time series data on networks, but focuses on extracting and analysing time-respecting paths [8]. Similarly, DyNetX [9], a pure Python library relying on NetworkX, focuses on temporal slicing and the computation of time-respecting paths. The recently released Reticula offers a range of methods developed in C++ with a Python interface [10]. Phasik [11], written in Python, focuses on inferring phases from temporal network data. EvolvingGraphs.jl [12], Recall-Graph [13] and Chronograph [14] all saw significant work before development was halted indefinitely.
Raphtory is a valuable addition to this ecosystem for the following reasons. Originally developed in Scala [15], its current core is written entirely in Rust. This ensures fast and memory-efficient computation that a pure Python implementation could not achieve, and allows the library to handle the sheer volume of temporal network data, which often dwarfs that of an equivalent static network.
The library provides an expressive Python interface for interoperability with other data science tools, as well as simpler and more maintainable code. In addition, the library is built with a focus on scalability, as it relies on efficient data structures that can be used to extract different views of large temporal graphs. This avoids creating multiple graph objects, which is not feasible with large datasets. The use of these features is supported by well-documented APIs and tutorials, guiding the user from data loading through to analysis.

OVERVIEW
The core Raphtory model consists of a base temporal graph which maintains a chronological log of all changes to its structure and property values over time. A graph can be created using simple functions for adding/removing vertices and edges at different time points, as well as updating their properties. Alternatively, a graph can be generated through in-built loaders for common data sources/formats (fig. 1a).
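To make the idea of a chronological log concrete, the following is a minimal pure-Python sketch. It is not Raphtory's implementation or API: the class `ToyTemporalGraph` and its methods `add_edge`, `delete_edge` and `edge_state_at` are hypothetical names chosen for illustration, and the sketch assumes that a deletion can be stored as an update with no property map.

```python
from bisect import bisect_right
from collections import defaultdict


class ToyTemporalGraph:
    """Toy append-only log of edge updates (illustrative only, not Raphtory)."""

    def __init__(self):
        # (src, dst) -> list of (time, property map or None), kept sorted by time
        self._log = defaultdict(list)

    def add_edge(self, time, src, dst, **properties):
        self._insert(time, src, dst, dict(properties))

    def delete_edge(self, time, src, dst):
        # Assumption of this sketch: a deletion is an update with no property map.
        self._insert(time, src, dst, None)

    def _insert(self, time, src, dst, payload):
        history = self._log[(src, dst)]
        times = [t for t, _ in history]
        # Updates may arrive out of order; insert at the chronological position.
        history.insert(bisect_right(times, time), (time, payload))

    def edge_state_at(self, time, src, dst):
        """Property map of the latest update at or before `time`, else None."""
        history = self._log.get((src, dst), [])
        times = [t for t, _ in history]
        i = bisect_right(times, time)
        return history[i - 1][1] if i else None


g = ToyTemporalGraph()
g.add_edge(1, "alice", "bob", weight=1.0)
g.add_edge(5, "alice", "bob", weight=3.0)
g.delete_edge(8, "alice", "bob")
g.add_edge(3, "alice", "bob", weight=2.0)  # late-arriving update

print(g.edge_state_at(4, "alice", "bob"))  # {'weight': 2.0}
print(g.edge_state_at(9, "alice", "bob"))  # None (edge deleted at time 8)
```

Because every update is retained rather than overwritten, the state of the edge at any past time can be recovered from the same object.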
Once a graph has been created, a user may generate 'graph views' which set some structural or temporal constraints through which the underlying graph may be observed. Graph views can be generated programmatically over a desired time range (windows), over sets of nodes which pass some user-defined criteria (subgraphs), or over a subset of layers if the graph is multilayered. Additionally, the views can leverage event durations and support various semantics for deletions. To reduce memory footprint, graph views are only materialised upon access. This allows a user to maintain thousands of different perspectives of their graph simultaneously, which can be explored and compared through the application of graph algorithms and metrics (fig. 1b). Furthermore, Raphtory provides extensions for null model generation and exporting of views to other graph libraries such as NetworkX.
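The principle of views that are only materialised upon access can be sketched in a few lines of pure Python. This is an illustration of the idea rather than Raphtory's data structures: `ToyView` and its methods are hypothetical, events are assumed to be `(time, src, dst, layer)` tuples, and windows are assumed to be half-open intervals `[start, end)`.

```python
class ToyView:
    """A view stores only constraints; edges are filtered when accessed."""

    def __init__(self, events, start=None, end=None, layers=None):
        self._events = events  # shared reference to the base log, never copied
        self._start = start
        self._end = end
        self._layers = layers

    def window(self, start, end):
        # Nested windows intersect with any window already applied.
        lo = start if self._start is None else max(start, self._start)
        hi = end if self._end is None else min(end, self._end)
        return ToyView(self._events, lo, hi, self._layers)

    def layer(self, name):
        if self._layers is not None and name not in self._layers:
            allowed = frozenset()  # layer already excluded by this view
        else:
            allowed = frozenset([name])
        return ToyView(self._events, self._start, self._end, allowed)

    def edges(self):
        # Filtering happens here, lazily, each time the view is read.
        for time, src, dst, layer in self._events:
            if self._start is not None and time < self._start:
                continue
            if self._end is not None and time >= self._end:
                continue
            if self._layers is not None and layer not in self._layers:
                continue
            yield (time, src, dst, layer)

    def out_degree(self, node):
        return len({dst for _, src, dst, _ in self.edges() if src == node})


events = [
    (1, "a", "b", "email"),
    (2, "a", "c", "chat"),
    (5, "a", "d", "email"),
    (9, "b", "a", "email"),
]
base = ToyView(events)
early = base.window(0, 5)

print(early.out_degree("a"))                   # 2
print(base.layer("email").out_degree("a"))     # 2
print(list(early.layer("email").edges()))      # [(1, 'a', 'b', 'email')]
```

Each view holds a handful of constraint fields and a reference to the shared log, so creating many views costs very little memory; the price is that the filter is re-evaluated on every access.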
Raphtory includes fast and scalable implementations of algorithms for temporal network mining such as temporal motifs (fig. 1c) and temporal reachability. In addition, it exposes its internal API for implementing algorithms in Rust, and surfacing them in Python. Finally, Raphtory is built with a focus on ease of use and can be installed using standard Python and Rust package managers. Once installed, it can be integrated within an analysis pipeline or run standalone as a GraphQL service.
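As an illustration of what temporal reachability means, the sketch below computes earliest arrival times along time-respecting paths in plain Python. It is a textbook-style single pass over time-sorted edges, not Raphtory's implementation; the function name `earliest_arrival` is hypothetical, events are assumed to be `(time, src, dst)` tuples, and timestamps are assumed to increase strictly along a path.

```python
def earliest_arrival(events, source):
    """Earliest time each node can be reached from `source`.

    A path is time-respecting if its edge timestamps strictly increase.
    Nodes that cannot be reached are absent from the result.
    """
    arrival = {}
    for time, src, dst in sorted(events):
        # An edge can be used if it leaves the source, or leaves a node
        # that was reached strictly before this edge occurs.
        can_leave = src == source or (src in arrival and arrival[src] < time)
        # Edges are visited in time order, so the first arrival is the earliest.
        if can_leave and dst != source and dst not in arrival:
            arrival[dst] = time
    return arrival


events = [
    (3, "b", "c"),
    (1, "a", "b"),
    (2, "c", "d"),
    (4, "c", "e"),
    (3, "c", "f"),
]
print(earliest_arrival(events, "a"))  # {'b': 1, 'c': 3, 'e': 4}
```

In this example "d" and "f" are connected to "a" in the static sense, but not temporally: the edge to "d" occurs before "c" is reached, and the edge to "f" occurs at the same instant "c" is reached. This is the kind of distinction that a static network analysis would miss.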

PROJECTS USING RAPHTORY
Raphtory has proved an invaluable resource in industrial and academic projects, for instance to characterise the time evolution of the fringe social network Gab [17], the transactions of users of the dark web marketplace Alphabay using temporal motifs [16], and anomalous patterns of activity in NFT trades [18]. The library has recently been significantly rewritten, and we expect that with its new functionalities, efficiency and ease of use, it will become an essential part of the network science community.

Figure 1:
(a) In a temporal network, edges are dynamical entities connecting pairs of nodes. Left is code for creating an example temporal network in Raphtory from a dataset of emails, right is a visual schematic containing this information.
(b) Generation of a sequence of graph views at a given time resolution and on selected layers, to run standard network algorithms, here PageRank. Left: the all-time top five nodes by PageRank are selected, then their PageRank over a set of monthly sliding windows is calculated. Right: these values are plotted over time in a dataset of Ethereum transactions.
(c) Raphtory offers rapid implementations of algorithms specifically designed for temporal networks, here finding significant temporal motifs [16]. Left shows the code required for extracting all three-edge, up-to-three-node motifs from a dataset of StackExchange interactions that complete within 1 hour (3600 seconds), right is a visualisation of these motif counts (the row is the first two edges of the motif, the column the third edge).