yaml2sbml: Human-readable and -writable specification of ODE models and their conversion to SBML

Ordinary differential equation (ODE) models are used throughout the natural sciences to describe dynamic processes. In systems biology, ODE models are mostly stored and exchanged using the Systems Biology Markup Language (SBML), a widely adopted community standard based on XML. The Parameter Estimation table (PEtab) format extends SBML to parameter estimation problems. A large number of software tools support simulation of SBML models and parameter estimation for PEtab problems. Specifying ODE models in SBML and parameter estimation problems in PEtab provides access to these tools. However, SBML is considered to be neither human-readable nor human-writable. An easy-to-use approach to constructing SBML/PEtab models that is tailored to ODE models will facilitate model generation.

In this contribution, we present yaml2sbml, a Python tool for converting ODE models specified in an easy-to-read and -write YAML file into SBML/PEtab. yaml2sbml comes with a format validator for the input YAML, a command-line interface (CLI) and a model editor to: 1) create an ODE model programmatically in Python that can then be saved as SBML, PEtab or YAML and 2) edit an ODE model previously encoded in YAML. Several examples illustrate the use of yaml2sbml on realistic problems.
yaml2sbml: Human-readable and -writable specification of ODE models and their conversion to SBML. Journal of Open Source Software, 6(61), 3215. https://doi.org/10.21105/joss.03215
Model parameters can be estimated from data by formulating a likelihood function. To this end, the system states must be mapped to measured quantities by observable functions, and a measurement noise model must be specified. The PEtab format was recently introduced to complement SBML with tab-separated value files specifying observables, measurements, experimental conditions, and estimated parameters (Schmiester et al., 2021). Currently, nine software toolboxes support PEtab as an input format, among them COPASI, d2d, dMod, and AMICI/pyPESTO. The PEtab documentation gives a complete and up-to-date list of tools.
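To make the roles of the observable function and the noise model concrete, the following minimal sketch computes a negative log-likelihood under independent additive Gaussian noise. The scaled-state observable, the function names, and all numbers are assumptions made for illustration; they are not taken from PEtab or from any of the tools named above.

```python
import math


def observable(state, scaling):
    """Hypothetical observable: the measured quantity is a scaled state."""
    return scaling * state


def negative_log_likelihood(states, measurements, scaling, sigma):
    """Negative log-likelihood for independent additive Gaussian noise."""
    nll = 0.0
    for x, y in zip(states, measurements):
        residual = y - observable(x, scaling)
        nll += 0.5 * math.log(2 * math.pi * sigma**2)
        nll += 0.5 * (residual / sigma) ** 2
    return nll


# Assumed model output at three time points and made-up measurements.
simulated_states = [1.0, 2.0, 3.0]
measured_values = [2.1, 3.9, 6.2]

# A scaling close to the one that generated the data gives a smaller value.
nll_good = negative_log_likelihood(simulated_states, measured_values, 2.0, 0.2)
nll_bad = negative_log_likelihood(simulated_states, measured_values, 1.0, 0.2)
print(nll_good < nll_bad)
```

Parameter estimation then amounts to minimizing such a function over the unknown parameters, here the scaling and possibly the noise level.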
Thanks to the aforementioned tools, model simulation or parameter estimation has become a matter of a few lines of code or clicks. However, ODE model definition is often a bottleneck, since constructing an SBML model from scratch is tedious. Therefore, various approaches to facilitate model construction from text-based input formats or in code have been presented, such as libsbml (Bornstein et al., 2008), SimpleSBML (Cannistra et al., 2015), MOCCASIN (Gómez et al., 2016), Antimony (Smith et al., 2009) and ScrumPy (Poolman, 2006). MOCCASIN translates MATLAB code into SBML. Other tools have a text-based input format that is centered around chemical reactions and not around ODEs directly (e.g. ScrumPy), or offer only a text-based (Antimony) or only a Python-based way of defining SBML models (libsbml, SimpleSBML), but not both interchangeably. None of these tools offers PEtab support.
Here, we present a human-readable and -writable format tailored to ODE models that is based on YAML and can be validated and translated to SBML and PEtab via the Python tool yaml2sbml and a CLI. Furthermore, yaml2sbml comes with a format validator and a Python-based model editor that allows one to generate, import, extend and export a YAML model within code. Figure 1 gives an overview of the typical workflow for model generation and conversion using yaml2sbml.
Figure 1: Typical workflow for model generation and conversion using yaml2sbml. The ODE is written as a YAML file using any text editor, the API, or the object-oriented model editor. These can be used interchangeably. The conversion from YAML to SBML or PEtab can be performed in Python or by the CLI.

YAML Format
Building the input format on YAML makes the model easy to parse and validate while keeping the simplicity of a text-based format (see Figure 1). The format is organized in blocks for the different model components. For more details, we refer to the format specification.
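As an illustration of the block structure, the sketch below stores a small Lotka-Volterra model as YAML text and writes it to a file. The block and key names (odes, parameters, stateId, rightHandSide, initialValue, parameterId, nominalValue) reflect our reading of the format and should be checked against the format specification; the numerical values and the file name are illustrative assumptions.

```python
import tempfile
from pathlib import Path

# Lotka-Volterra model in the YAML layout.
# Key names are assumptions to be checked against the format specification.
LOTKA_VOLTERRA_YAML = """\
odes:
  - stateId: x_1
    rightHandSide: alpha * x_1 - beta * x_1 * x_2
    initialValue: 2
  - stateId: x_2
    rightHandSide: delta * x_1 * x_2 - gamma * x_2
    initialValue: 2

parameters:
  - parameterId: alpha
    nominalValue: 2
  - parameterId: beta
    nominalValue: 4
  - parameterId: gamma
    nominalValue: 3
  - parameterId: delta
    nominalValue: 3
"""


def write_model(directory):
    """Write the YAML text to <directory>/lotka_volterra.yml and return the path."""
    path = Path(directory) / "lotka_volterra.yml"
    path.write_text(LOTKA_VOLTERRA_YAML, encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as tmp:
    yaml_file = write_model(tmp)
    # Count the state definitions as a simple sanity check.
    n_states = yaml_file.read_text(encoding="utf-8").count("stateId:")
    print(yaml_file.name, n_states)
```

A file written this way can be opened and edited in any text editor, which is the main point of a YAML-based format.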

Python Tool and Command-Line Interface
The Python tool yaml2sbml allows one to validate models specified in the YAML input format and convert them to SBML or PEtab via

    import yaml2sbml

    # format validation
    yaml2sbml.validate_yaml(yaml_file)

    # SBML conversion
    yaml2sbml.yaml2sbml(yaml_input_file, sbml_output_file)

    # PEtab conversion
    yaml2sbml.yaml2petab(yaml_input_file, PEtab_dir, model_name)

Validation is also performed internally before model conversion. libsbml (Bornstein et al., 2008) generates and validates the resulting SBML. The validator in the PEtab library checks the resulting TSV files during conversion to PEtab.
Alongside its Python API, yaml2sbml comes with a CLI offering the same functionality via the commands yaml2sbml, yaml2petab, and yaml2sbml_validate.
The yaml2sbml model editor allows one to generate ODE models and to programmatically add, delete, or modify model components. Further, the model editor allows one to import models from YAML and export them to YAML, SBML or PEtab.

Examples
Three notebooks use the Lotka-Volterra equations (Lotka, 1920) to introduce the different aspects of yaml2sbml: the Python toolbox, the CLI and the model editor. Another notebook showcases features of the input format such as time-dependent or discontinuous right-hand sides. The introductory examples are complemented by two more comprehensive examples of ODE models that do not fit the classical reaction network formulation for which SBML is intended.
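For reference, the Lotka-Volterra system can be written as dx_1/dt = alpha x_1 - beta x_1 x_2 and dx_2/dt = delta x_1 x_2 - gamma x_2. The sketch below integrates it with a hand-written fourth-order Runge-Kutta step using only the standard library, and checks that the known conserved quantity of the system stays nearly constant. Parameter values, initial state and step size are illustrative assumptions, not values taken from the notebooks.

```python
import math


def lotka_volterra(state, alpha, beta, gamma, delta):
    """Right-hand side of the Lotka-Volterra equations."""
    x1, x2 = state
    return (alpha * x1 - beta * x1 * x2, delta * x1 * x2 - gamma * x2)


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step for dstate/dt = f(state)."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = f(state)
    k2 = f(shifted(state, k1, dt / 2))
    k3 = f(shifted(state, k2, dt / 2))
    k4 = f(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def invariant(state, alpha, beta, gamma, delta):
    """Quantity conserved along exact Lotka-Volterra trajectories."""
    x1, x2 = state
    return delta * x1 - gamma * math.log(x1) + beta * x2 - alpha * math.log(x2)


# Illustrative parameter values (assumptions).
params = dict(alpha=2.0, beta=4.0, gamma=3.0, delta=3.0)


def rhs(state):
    return lotka_volterra(state, **params)


state = (2.0, 2.0)
v_start = invariant(state, **params)
for _ in range(5000):  # integrate from t = 0 to t = 5
    state = rk4_step(rhs, state, dt=0.001)
v_end = invariant(state, **params)
print(state, abs(v_end - v_start))
```

The drift of the invariant is a convenient correctness check for any implementation of these equations, independent of the simulation tool used.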
The first application example considers the Chemical Master Equation (CME) (Gillespie, 1992), a stochastic model of (bio-)chemical processes. The Finite State Projection (FSP) truncates the infinite state space of the CME, yielding a finite-dimensional ODE (Munsky & Khammash, 2006). The example implements the FSP for a two-stage model of gene expression (Shahrezaei & Swain, 2008). yaml2sbml allows one to implement the 1000-dimensional ODE in fewer than 20 lines of code by exploiting the rich problem structure.
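The regular structure of the FSP can be sketched in plain Python as follows. Each state x_m_p is the probability of having m mRNA molecules and p proteins; transcription, mRNA degradation, translation and protein degradation couple neighbouring states. The truncation to a 20 x 50 grid, the parameter names (k_m, g_m, k_p, g_p) and the dictionary keys are assumptions made for this illustration. The code does not call yaml2sbml and is not the implementation from the example notebook.

```python
def fsp_odes(n_mrna, n_protein):
    """Build FSP right-hand sides for states x_m_p on a truncated grid."""
    odes = []
    for m in range(n_mrna):
        for p in range(n_protein):
            inflow = []
            if m > 0:  # transcription: (m-1, p) -> (m, p)
                inflow.append(f"k_m * x_{m - 1}_{p}")
            if m + 1 < n_mrna:  # mRNA degradation: (m+1, p) -> (m, p)
                inflow.append(f"{m + 1} * g_m * x_{m + 1}_{p}")
            if p > 0 and m > 0:  # translation: (m, p-1) -> (m, p)
                inflow.append(f"{m} * k_p * x_{m}_{p - 1}")
            if p + 1 < n_protein:  # protein degradation: (m, p+1) -> (m, p)
                inflow.append(f"{p + 1} * g_p * x_{m}_{p + 1}")
            # Outflow keeps reactions that leave the grid, so probability
            # leaks out at the boundary of the truncation.
            outflow = f"(k_m + {m} * g_m + {m} * k_p + {p} * g_p) * x_{m}_{p}"
            odes.append({
                "stateId": f"x_{m}_{p}",
                "rightHandSide": (" + ".join(inflow) or "0") + " - " + outflow,
                # All probability mass starts in the empty state (0, 0).
                "initialValue": 1 if (m, p) == (0, 0) else 0,
            })
    return odes


odes = fsp_odes(n_mrna=20, n_protein=50)
print(len(odes))
print(odes[0]["rightHandSide"])
```

The two nested loops replace what would otherwise be a thousand hand-written equations, which is the kind of structure a programmatic model editor can exploit.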
The second application example considers a well-established ODE model of human glucose-insulin metabolism with 22 state variables (Sorensen, 1985). The Jupyter Notebook presents an implementation of the Sorensen model in the YAML format and uses the model editor to extend the preexisting YAML model to encode a patient-specific treatment.