Devicely: A Python package for reading, timeshifting and writing sensor data

Wearable devices can track a multitude of parameters such as heart rate, body temperature, blood oxygen saturation, acceleration, blood glucose and many more (Kamišalić et al., 2018). Moreover, they are becoming increasingly popular, with a steep increase in market presence in 2020 alone (IDC, 2020). Applications for wearable devices range from tracking cardiovascular risks (Bayoumy et al., 2021) to identifying COVID-19 onset (Mishra et al., 2020). There is therefore a great need for scientists to work easily with data acquired from different wearables and to share these data while protecting user privacy. To address this need and empower scientists working with biosignals, we developed the devicely package. It converts the data into a tabular format and contains tools for data de-identification. It allows scientists to focus on what they want: the analysis of biosignals guided by privacy principles.


Related Work
The first example of a package working with wearable data is mhealthtools, which is developed in R and focuses on extracting features from sensors such as inertial measurement units (IMUs). It differs from devicely in two ways: it is written in R rather than Python, and its role is complementary. mhealthtools offers functionalities for feature extraction, whereas devicely is intended to help users in a prior step by reading data from wearables and writing them into standardized formats.
There are also packages developed in Python, such as SleepPy (Christakis et al., 2019), which uses raw accelerometer data to assess sleep quantity and quality. The HRV package (Bartels & Peçanha, 2020) uses CSV and text files or Python iterables such as lists to generate features related to heart rate variability (HRV). GaitPy (Czech & Patel, 2019) accepts input data in a customizable format and is mainly used to extract features for gait analysis. Packages such as SleepPy, HRV and GaitPy could therefore be used in a subsequent step to extract features from the output generated by devicely.
FLIRT (Maritsch et al., 2020) and wearablecompute (Bent, 2020) are packages that provide ways of reading data from specific wearables such as the Empatica E4. They also include functionalities to extract features from electrodermal activity (EDA), acceleration and HRV. The main difference between these two packages and devicely is one of focus: feature extraction versus privacy and data sharing. FLIRT and wearablecompute read the data in order to extract features, while devicely aims to provide users with a way to read the data, de-identify them as necessary and write them back in a specified format. In this way, researchers can strengthen data privacy and use the data easily for further analysis and sharing.

Statement of need
Every wearable vendor uses a different data format, and reading these formats is often a challenge for scientists. We developed the devicely package so that researchers can use data from different sensors in an easy and friendly way. The package also contains two methods that help with data de-identification: timeshift and write. The idea behind them is that researchers can shift all their time series data to a time different from the one at which the experiments actually occurred, and then write this de-identified dataset back to the original or a similar data format. This will empower scientists to maintain user privacy and, hopefully, to share more data and thereby increase research reproducibility.

Design
Different wearables provide incompatible data formats which require specific preprocessing steps. However, it should be easy for scientists to add data from a new wearable to an existing pipeline and easy for developers to add a new wearable to the devicely package. To achieve both, devicely encapsulates data preparation for each wearable behind three common methods: read, timeshift and write.
After reading, the data is accessible through the reader in common formats such as Pandas DataFrames. De-identification is achieved by timeshifting the data, either by a user-provided interval or by a random one. When writing back de-identified data, devicely keeps a format that can be read again using the same reader class. In almost all cases, this is the same format as the one the wearable originally provides. This enables sharing data with the community while maintaining user privacy.
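The read, timeshift and write pattern described above can be illustrated with a minimal, dependency-free sketch. This is not devicely's implementation: the class ToyReader, its two-column CSV layout and the column names timestamp and heart_rate are all invented for illustration, and plain lists stand in for the DataFrames that the package exposes. The point is only that write emits the same layout that the constructor reads, so the shifted data can be loaded again by the same class.

```python
import csv
import os
import tempfile
from datetime import datetime, timedelta


class ToyReader:
    """Hypothetical reader for an invented CSV layout (not part of devicely)."""

    def __init__(self, path):
        # Reading happens on construction: parse timestamps and values.
        with open(path, newline="") as f:
            rows = list(csv.DictReader(f))
        self.timestamps = [datetime.fromisoformat(r["timestamp"]) for r in rows]
        self.heart_rate = [float(r["heart_rate"]) for r in rows]

    def timeshift(self, shift):
        # Move every timestamp by the same interval, so spacing is preserved.
        self.timestamps = [t + shift for t in self.timestamps]

    def write(self, path):
        # Emit the same layout that __init__ reads, so a round trip works.
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["timestamp", "heart_rate"])
            for t, hr in zip(self.timestamps, self.heart_rate):
                writer.writerow([t.isoformat(), hr])


# Usage: read a small file, shift it ten days into the past, write, re-read.
with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, "original.csv")
    with open(original, "w", newline="") as f:
        f.write("timestamp,heart_rate\n"
                "2021-03-01T08:00:00,61.0\n"
                "2021-03-01T08:00:01,62.5\n")

    reader = ToyReader(original)
    reader.timeshift(timedelta(days=-10))
    shifted = os.path.join(folder, "shifted.csv")
    reader.write(shifted)

    reread = ToyReader(shifted)  # the same class reads its own output
    print(reread.timestamps[0], reread.heart_rate)
```

Keeping the written format readable by the same class means a shared, de-identified dataset needs no extra loader on the receiving side.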

Functionalities
All reader classes support three core functions: reading data created by a wearable, timeshifting them and writing them back. To read data, the corresponding reader class is initialized with a path to the data created by the wearable. After reading, the data can be accessed through the reader in convenient formats such as dictionaries and Pandas DataFrames.
After a reader object has been created, its timeshift method can be called. This de-identifies the data by shifting all time-related data points. To control the shifting interval, a parameter can be provided to timeshift. If no parameter is provided, the data is shifted by a random time interval to the past. The timeshifted data can be written back using the write method.
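The two shifting modes can be sketched as a standalone function. This is an assumption-laden illustration rather than the package's code: the function signature, the max_days bound of 730 days and the rng parameter are hypothetical choices made here, and the source does not state how devicely draws its random interval. What the sketch does reflect is the behaviour described above: an explicit interval is applied as given, and a missing interval is replaced by a random one that points into the past.

```python
import random
from datetime import datetime, timedelta


def timeshift(timestamps, shift=None, max_days=730, rng=random):
    """Return shifted copies of `timestamps` and the shift that was applied.

    Hypothetical helper for illustration; not devicely's API.
    """
    if shift is None:
        # No interval given: draw a random one that points into the past.
        # The 730-day bound is an arbitrary choice for this sketch.
        seconds = rng.randint(1, max_days * 24 * 3600)
        shift = -timedelta(seconds=seconds)
    # Every point moves by the same amount, so sampling intervals are kept.
    return [t + shift for t in timestamps], shift


samples = [datetime(2021, 3, 1, 8, 0, 0), datetime(2021, 3, 1, 8, 0, 1)]

# Explicit interval: shift forward by two hours.
explicit, used = timeshift(samples, timedelta(hours=2))

# No interval: random shift to the past (seeded here only for repeatability).
randomised, drawn = timeshift(samples, rng=random.Random(0))
print(used, drawn < timedelta(0))
```

Because all points move by one common offset, durations and sampling rates are unchanged and only the absolute date and time are obscured.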
For all wearables, the written data can be read again using the same reader class. Figure 1 depicts the class structure of the devicely package and serves as a guide for future implementation.

Figure 1: The left side depicts the structure of the files in the devicely package. Each device should have a separate file. The top right shows how to import a class from a device file into __init__.py. The bottom right shows an example of one device class and its methods.

Availability
The software can be obtained through the Python Package Index (PyPI), Conda-forge, Zenodo and on GitHub under the MIT License.

Mention
This package was used in the following paper: