BioPsyKit: A Python package for the analysis of biopsychological data


Summary
Biopsychology is a field of psychology that analyzes how biological processes interact with behaviour, emotion, cognition, and other mental processes. Biopsychology covers, among others, the topics of sensation and perception, emotion regulation, movement (and control of such), sleep and biological rhythms, as well as acute and chronic stress.
To assess the interaction between biological and mental processes, biopsychology draws on a variety of modalities: electrophysiology, assessed, for instance, via electrocardiography (ECG), electrodermal activity (EDA), or electroencephalography (EEG); sleep, activity, and movement, assessed via inertial measurement units (IMUs); neuroendocrine and inflammatory biomarkers, assessed via saliva and blood samples; and self-reports, assessed via psychological questionnaires.
These different modalities are collected either "in the lab," during standardized laboratory protocols, or "in the wild," during unsupervised protocols in home environments. The collected data are typically analyzed using statistical methods, or, more recently, using machine learning methods.
While some software packages exist that allow for the analysis of single data modalities, such as electrophysiological data, or sleep, activity, and movement data, no packages are available for the analysis of other modalities, such as neuroendocrine and inflammatory biomarkers, and self-reports. To fill this gap and, at the same time, to combine all tools required for analyzing biopsychological data from beginning to end in a single Python package, we developed BioPsyKit.

Statement of Need
Researchers in biopsychology often face the challenge of analyzing data from different assessment modalities during experiments in order to capture the complex interaction between biological and mental processes.
One example is collecting (electro)physiological data (e.g., ECG) or salivary biomarkers (e.g., cortisol) during an acute stress protocol, and investigating the correlation between these biomarkers and psychometric data assessed via self-reports, such as perceived stress, state anxiety, or positive/negative affect. Another example is the assessment of relationships between sleep and neuroendocrine responses in the morning. To assess the beginning and end of sleep periods, as well as other sleep-related parameters, researchers typically use inertial measurement units (IMUs) or activity trackers. These data are then combined with psychometric data from self-reports (e.g., sleep quality, stress coping, etc.) and data from saliva samples to assess the cortisol awakening response (CAR) in the morning.
While some packages already address a subset of these applications, such as NeuroKit2 (Makowski et al., 2021) for the analysis of (electro)physiological data, or SleepPy (Python) (Christakis et al., 2019) and GGIR (R) (Migueles et al., 2019) for sleep analysis from accelerometer data, no software package exists that unites all these heterogeneous data modalities under one umbrella. Furthermore, to the best of our knowledge, no software packages exist that allow a standardized analysis of neuroendocrine biomarkers without the requirement to write analysis code from scratch. Likewise, no software packages have been published to date that implement established psychological questionnaires and allow computing questionnaire (sub)scales from raw questionnaire items.
BioPsyKit addresses these limitations and offers the necessary building blocks for the analysis of biopsychological data. Our software package allows researchers to systematically combine, process, and analyze data from different modalities using one common API. This enables them to write cleaner and more reproducible analysis code, to export results in a standardized format, and to create high-quality figures for scientific publications.

BioPsyKit Structure
The following section describes the structure and the core modules of BioPsyKit. An overview is also provided in Figure 1.

Physiological Signal Analysis
The module biopsykit.signals can be used for the analysis of various (electro)physiological signals (ECG, EEG, Respiration, Motion, and more). This includes:
• Classes to create processing pipelines for various physiological signals and for extracting relevant parameters from these signals. For physiological signal processing, BioPsyKit internally relies on the NeuroKit2 Python library (Makowski et al., 2021), but offers further functionalities (e.g., the possibility to apply different R peak outlier removal techniques for R peaks extracted from ECG data).
• Plotting functions specialized for visualizing different physiological signals.
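
To illustrate what an R peak outlier removal step can look like, the following minimal sketch flags RR intervals that are physiologically implausible or deviate strongly from the median interval, and replaces them by the median of the remaining intervals. It uses only the Python standard library and is not BioPsyKit's or NeuroKit2's implementation; the function names, the heart rate range, and the deviation threshold are assumptions chosen for illustration.

```python
from statistics import median


def rr_intervals_ms(r_peaks, sampling_rate_hz):
    """Convert R peak sample indices into RR intervals in milliseconds."""
    return [
        (b - a) / sampling_rate_hz * 1000.0
        for a, b in zip(r_peaks, r_peaks[1:])
    ]


def flag_rr_outliers(rr_ms, hr_range_bpm=(40.0, 200.0), max_rel_dev=0.3):
    """Flag RR intervals that are implausible or far from the median.

    Both thresholds are illustrative assumptions, not library defaults.
    """
    min_rr = 60000.0 / hr_range_bpm[1]  # shortest plausible RR interval
    max_rr = 60000.0 / hr_range_bpm[0]  # longest plausible RR interval
    med = median(rr_ms)
    flags = []
    for rr in rr_ms:
        out_of_range = rr < min_rr or rr > max_rr
        far_from_median = abs(rr - med) > max_rel_dev * med
        flags.append(out_of_range or far_from_median)
    return flags


def replace_outliers(rr_ms, flags):
    """Replace flagged intervals by the median of the unflagged ones."""
    clean = [rr for rr, bad in zip(rr_ms, flags) if not bad]
    fill = median(clean)
    return [fill if bad else rr for rr, bad in zip(rr_ms, flags)]


# Synthetic R peak positions (in samples) at 256 Hz. One beat was missed
# between sample 768 and sample 1280, which doubles that RR interval.
r_peaks = [0, 256, 512, 768, 1280, 1536, 1792]
rr = rr_intervals_ms(r_peaks, sampling_rate_hz=256)
flags = flag_rr_outliers(rr)
rr_clean = replace_outliers(rr, flags)
```

In this synthetic example only the doubled interval is flagged. In practice, several criteria are often combined, and the choice between replacing and interpolating flagged intervals depends on the downstream heart rate variability analysis.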

Sleep Analysis
The module biopsykit.sleep can be used for the analysis of motion data collected during sleep. This includes:
• Different algorithms for sleep/wake detection from wrist-worn activity or IMU data, such as the Cole/Kripke (Cole et al., 1992) or the Sadeh algorithm (Sadeh et al., 1994).
• Computation of sleep endpoints from detected sleep and wake phases and functions for plotting sleep processing results (e.g., Figure 2).
• Functions to import and process data from commercially available sleep trackers (e.g., Withings Sleep Analyzer).
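
As an illustration of how sleep endpoints can be derived once sleep and wake phases have been detected, the sketch below computes a few common endpoints from an epoch-wise sleep/wake sequence. The function name, the dictionary keys, and the simplified endpoint definitions are assumptions made for this example and do not reflect BioPsyKit's API; the sleep/wake detection itself is assumed to have been performed beforehand.

```python
def sleep_endpoints(epochs, epoch_length_min=1.0):
    """Derive basic sleep endpoints from a sleep/wake sequence.

    epochs: list of 0 (wake) and 1 (sleep), one entry per epoch, assumed
    to cover the whole time in bed. Definitions are simplified.
    """
    if 1 not in epochs:
        raise ValueError("no sleep epochs detected")

    first_sleep = epochs.index(1)
    last_sleep = len(epochs) - 1 - epochs[::-1].index(1)
    # Epochs between the first and the last sleep epoch (inclusive)
    sleep_period = epochs[first_sleep:last_sleep + 1]

    time_in_bed = len(epochs) * epoch_length_min
    total_sleep_time = sum(epochs) * epoch_length_min
    return {
        "sleep_onset_latency_min": first_sleep * epoch_length_min,
        "total_sleep_time_min": total_sleep_time,
        "wake_after_sleep_onset_min": sleep_period.count(0) * epoch_length_min,
        "sleep_efficiency_pct": 100.0 * total_sleep_time / time_in_bed,
    }


# Ten one-minute epochs: two wake epochs before sleep onset, one short
# awakening during the night, and two wake epochs at the end.
epochs = [0, 0, 1, 1, 1, 0, 1, 1, 0, 0]
result = sleep_endpoints(epochs)
```

Real recordings additionally require a definition of the time in bed (e.g., from a sleep diary or from the major rest period), which this sketch takes as given.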

Biomarker Analysis
The module biopsykit.saliva can be used for the analysis of saliva-based biomarkers, such as cortisol and alpha-amylase. This also includes the extraction of relevant parameters characterizing salivary biomarkers (e.g., area under the curve (Pruessner et al., 2003), slope, maximum increase, and more) and specialized plotting functions.
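
The parameters named above can be sketched in a few lines of plain Python. The example follows the trapezoidal formulas of Pruessner et al. (2003) for the area under the curve with respect to ground and with respect to increase; the function names and the cortisol values are hypothetical and are not taken from BioPsyKit.

```python
def auc_ground(values, times):
    """Area under the curve with respect to ground (trapezoidal rule)."""
    return sum(
        (v0 + v1) / 2.0 * (t1 - t0)
        for v0, v1, t0, t1 in zip(values, values[1:], times, times[1:])
    )


def auc_increase(values, times):
    """Area under the curve with respect to increase.

    Subtracts the rectangle spanned by the first sample and the total
    sampling duration from the area with respect to ground.
    """
    return auc_ground(values, times) - values[0] * (times[-1] - times[0])


def max_increase(values):
    """Largest increase of any later sample relative to the first sample."""
    return max(values[1:]) - values[0]


def slope(values, times, i, j):
    """Slope between sample i and sample j (concentration per time unit)."""
    return (values[j] - values[i]) / (times[j] - times[i])


# Hypothetical cortisol concentrations (nmol/l) of one participant,
# sampled every 10 minutes.
cortisol = [4.0, 6.0, 10.0, 8.0, 5.0]
times = [0, 10, 20, 30, 40]

auc_g = auc_ground(cortisol, times)
auc_i = auc_increase(cortisol, times)
max_inc = max_increase(cortisol)
slope_0_2 = slope(cortisol, times, 0, 2)
```

The area with respect to increase can become negative when concentrations fall below the first sample, which is why it is usually reported alongside the area with respect to ground.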

Self-Report Analysis
The module biopsykit.questionnaires can be used for the analysis of psychometric self-reports, assessed via questionnaires. This includes:
• Functions to convert, clean, and impute tabular data from questionnaire studies.
• Implementations of various established psychological questionnaires, such as the Perceived Stress Scale (PSS) (Cohen et al., 1983) or the Primary Appraisal Secondary Appraisal Scale (PASA) (Gaab et al., 2005), and functions to compute scores from questionnaire data.
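
Computing a questionnaire score from raw items typically involves validating the item range, inverting reverse-keyed items, and aggregating. The sketch below shows this for the PSS, assuming the common 10-item version with ratings from 0 to 4 and reverse-scored items 4, 5, 7, and 8. The function name and the input format are assumptions for illustration and do not correspond to BioPsyKit's interface.

```python
# Positively worded PSS-10 items that are reverse-scored (1-based numbering)
PSS_REVERSED_ITEMS = {4, 5, 7, 8}


def pss_score(responses):
    """Compute the PSS-10 total score.

    responses: dict mapping item number (1..10) to a rating from 0 to 4.
    """
    if set(responses) != set(range(1, 11)):
        raise ValueError("expected ratings for items 1 to 10")
    for item, rating in responses.items():
        if rating not in range(5):
            raise ValueError(f"item {item}: rating {rating} outside 0..4")

    total = 0
    for item, rating in responses.items():
        # Reverse-scored items are inverted on the 0..4 scale
        total += 4 - rating if item in PSS_REVERSED_ITEMS else rating
    return total


# Hypothetical ratings of one participant
responses = {1: 3, 2: 4, 3: 2, 4: 1, 5: 0, 6: 3, 7: 1, 8: 2, 9: 4, 10: 3}
score = pss_score(responses)
```

Questionnaires with subscales follow the same pattern, with one aggregation per group of items instead of a single total.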

Support for Psychological Protocols
The module biopsykit.protocols provides an object-oriented interface for psychological protocols. On the one hand, it serves as a data structure to store and access data collected from different modalities during a psychological protocol. On the other hand, the object-oriented interface makes it convenient to compute analysis results from the data added to a particular protocol instance, to export results, and to create plots for data visualization. This includes:
• Protocols for the assessment of biological rhythms, especially acute stress, in the laboratory, e.g., Trier Social Stress Test (TSST) (Kirschbaum et al., 1993) or Montreal Imaging Stress Task (MIST) (Dedovic et al., 2005).
• Protocols for the assessment of biological rhythms in the wild, e.g., Cortisol Awakening Response (CAR).
• Specialized plotting functions for standardized visualization of data collected during these psychological protocols (e.g., heart rate data: Figure 3; saliva data: Figure 4).
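
The idea of a protocol object that both stores data and computes results from it can be sketched as follows. The class, its attributes, its methods, and the phase names are entirely hypothetical and only illustrate the design pattern; they are not BioPsyKit's protocol classes.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class StressProtocol:
    """Hypothetical container for data collected during one protocol."""

    name: str
    phases: list
    # Heart rate samples (bpm), stored as subject -> phase -> list of values
    heart_rate: dict = field(default_factory=dict)

    def add_heart_rate(self, subject, phase, values):
        """Store heart rate samples of one subject for one protocol phase."""
        if phase not in self.phases:
            raise ValueError(f"unknown phase: {phase}")
        self.heart_rate.setdefault(subject, {})[phase] = list(values)

    def mean_heart_rate(self):
        """Average the per-subject mean heart rate for each phase with data."""
        result = {}
        for phase in self.phases:
            subject_means = [
                mean(data[phase])
                for data in self.heart_rate.values()
                if phase in data
            ]
            if subject_means:
                result[phase] = mean(subject_means)
        return result


# Phase names and heart rate values are made up for this example.
protocol = StressProtocol("TSST", ["Baseline", "Stress", "Recovery"])
protocol.add_heart_rate("S01", "Baseline", [70, 72, 71])
protocol.add_heart_rate("S01", "Stress", [95, 100, 99])
protocol.add_heart_rate("S02", "Baseline", [64, 66, 65])
protocol.add_heart_rate("S02", "Stress", [88, 90, 92])
summary = protocol.mean_heart_rate()
```

Keeping the phase structure inside the object means that analysis and plotting code can rely on consistent phase names instead of re-deriving them for every modality.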

Simplified Evaluation
The modules biopsykit.stats and biopsykit.classification can be used for simplified evaluation of statistical analyses and machine learning pipelines that are frequently used in biopsychological research. biopsykit.stats provides functions to easily set up statistical analysis pipelines (using pingouin (Vallat, 2018)) and to visualize and export statistical analysis results in a standardized way (see, e.g., Figure 5).
biopsykit.classification provides functions to set up, optimize and evaluate different machine learning pipelines for biopsychological problems.

Availability
The software is available as a pip-installable package (pip install biopsykit), as well as on GitHub at: https://github.com/mad-lab-fau/BioPsyKit.