mitolina: MITOchondrial LINeage Analysis

Summary
This R (R Core Team, 2018) package, mitolina (MITOchondrial LINeage Analysis), contains functionality to simulate and analyse populations of mitochondrial genomes (mitogenomes). This is achieved using both R and C++ via Rcpp (Eddelbuettel & Balamuta, 2017) for efficient computations.
Mitochondria are of interest, for example in forensic genetics, because they are very resistant to degradation: it is sometimes possible to construct a mitochondrial DNA profile when a traditional DNA profile cannot be constructed (Andersen & Balding, 2018; Butler, 2009). This happens, for example, when the biological sample does not contain cell nuclei or when the sample is degraded (for example by time or environment).
Just as DNA profiles based on the Y chromosome are paternal lineage markers (inherited by sons from their fathers) (Andersen & Balding, 2017; Butler, 2009), DNA profiles based on the mitogenome are maternal lineage markers (inherited by children from their mothers) (Andersen & Balding, 2018; Butler, 2009). This software operates under the maternal-inheritance-only model, i.e. mtDNA is passed on only by mothers to their children. It is often helpful to simulate populations in lineage marker research, as recent research on using Y-chromosomal DNA profiles in forensic genetics demonstrates (Andersen, 2018; Andersen & Balding, 2017; Andersen, Caliebe, Jochens, Willuweit, & Krawczak, 2013; Andersen, Eriksen, & Morling, 2013). This R package, mitolina, is based on the R package malan (Andersen, 2018), which simulates populations of Y chromosomes. The packages are fundamentally different in two aspects, both caused by the way that paternal and maternal lineage markers behave genetically. First, with the mitogenome it is necessary to simulate both females and males (as males carry their mother's mitogenome), at least in the generations where the profiles must be used. Second, the DNA profiles used in forensic genetics differ between the two types of lineage markers: a mitogenomic DNA profile can be seen as a profile of many thousands of binary markers, whereas a Y-profile consists of 10-20 integer-valued markers.
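The two differences can be made concrete with a small sketch. It is written in Python purely for illustration and is not the mitolina or malan implementation. The function names, the per-site flip mutation for binary markers, the single-step (plus or minus one) mutation for integer markers, and all numeric values are assumptions chosen for this example.

```python
import random

def inherit_mitogenome(mother_profile, mutation_rate, rng):
    """Maternal lineage marker: a child of either sex copies the mother's
    binary profile; each site flips with probability mutation_rate
    (assumed mutation model for this sketch)."""
    return tuple(site ^ int(rng.random() < mutation_rate) for site in mother_profile)

def inherit_y_profile(father_profile, child_sex, mutation_rate, rng):
    """Paternal lineage marker: only sons inherit the father's integer-valued
    profile; each marker moves one step up or down with probability
    mutation_rate (assumed mutation model for this sketch)."""
    if child_sex != "M":
        return None  # daughters carry no Y chromosome
    return tuple(
        allele + rng.choice((-1, 1)) if rng.random() < mutation_rate else allele
        for allele in father_profile
    )

rng = random.Random(2018)

# Hypothetical parents: a binary mitogenomic profile with many sites
# and a Y-profile with 15 integer-valued markers.
mother_mito = tuple(rng.randint(0, 1) for _ in range(5000))
father_y = tuple(rng.randint(10, 20) for _ in range(15))

# Without mutation, both the son and the daughter carry the mother's mitogenome,
# whereas only the son carries the father's Y-profile.
son_mito = inherit_mitogenome(mother_mito, 0.0, rng)
daughter_mito = inherit_mitogenome(mother_mito, 0.0, rng)
son_y = inherit_y_profile(father_y, "M", 0.0, rng)
daughter_y = inherit_y_profile(father_y, "F", 0.0, rng)

print(son_mito == mother_mito, daughter_mito == mother_mito)
print(son_y == father_y, daughter_y is None)
```

Because sons carry a mitogenome that they do not pass on, a simulation of maternal lineages has to track males in the generations where profiles are compared, even though only females connect the generations.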
The simulation model allows for flexible simulations by first simulating a genealogy, with the population size at each generation specified by one vector for the number of females and one for the number of males, and with various parameters such as the variance in reproductive success (Andersen & Balding, 2017). Mitogenomes can then be imposed on the genealogy in various ways, and several sets of mutation rates are included (Översti et al., 2017; Rieux et al., 2014; Soares et al., 2009). Also included are 588 forensic-quality haplotypes representing three U.S. populations from Just et al. (2015); they can for example be used to distribute founder haplotypes. A simulated population can then be analysed, for example to find the individuals in the population with a certain mitogenome, or to obtain the distribution of the number of meioses between a queried contributor and the individuals in the population with a matching mitogenome.
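The workflow described above (genealogy first, then haplotypes, then queries) can be sketched as follows. This is a simplified illustration in Python under stated assumptions, not the mitolina implementation and not its API. Every function name is hypothetical, each child picks its mother uniformly at random (so variance in reproductive success is not modelled), the founder haplotypes are random binary tuples rather than the included haplotype data, and the mutation rate is an arbitrary value.

```python
import random
from collections import Counter

def simulate_genealogy(n_females, n_males, rng):
    """n_females[g] and n_males[g] give the sizes of generation g (0 = founders).
    Each non-founder picks a mother uniformly among the previous generation's females."""
    generations, sex, mother, next_id = [], {}, {}, 0
    for g, (nf, nm) in enumerate(zip(n_females, n_males)):
        prev_females = [i for i in generations[-1] if sex[i] == "F"] if g > 0 else []
        current = []
        for s, count in (("F", nf), ("M", nm)):
            for _ in range(count):
                sex[next_id] = s
                if g > 0:
                    mother[next_id] = rng.choice(prev_females)
                current.append(next_id)
                next_id += 1
        generations.append(current)
    return generations, sex, mother

def assign_haplotypes(generations, mother, founder_haps, mutation_rate, rng):
    """Founders draw a haplotype from founder_haps; every child copies its
    mother's binary haplotype, each site flipping with probability mutation_rate."""
    hap = {i: rng.choice(founder_haps) for i in generations[0]}
    for gen in generations[1:]:
        for i in gen:
            hap[i] = tuple(s ^ int(rng.random() < mutation_rate) for s in hap[mother[i]])
    return hap

def maternal_line(i, mother):
    """Individual i followed by its mother, grandmother, ... back to a founder."""
    line = [i]
    while line[-1] in mother:
        line.append(mother[line[-1]])
    return line

def meioses_between(a, b, mother):
    """Number of mother-to-child transmissions separating a and b,
    or None if they share no maternal ancestor in the genealogy."""
    depth_a = {anc: d for d, anc in enumerate(maternal_line(a, mother))}
    for d_b, anc in enumerate(maternal_line(b, mother)):
        if anc in depth_a:
            return depth_a[anc] + d_b
    return None

def matching_meioses(query, population, hap, mother):
    """Distribution of meioses between query and every other individual in
    population whose haplotype matches (key None: match from another lineage)."""
    dist = Counter()
    for i in population:
        if i != query and hap[i] == hap[query]:
            dist[meioses_between(query, i, mother)] += 1
    return dist

rng = random.Random(1)
generations, sex, mother = simulate_genealogy([50] * 10, [50] * 10, rng)
founder_haps = [tuple(rng.randint(0, 1) for _ in range(20)) for _ in range(5)]
hap = assign_haplotypes(generations, mother, founder_haps, 0.01, rng)
query = generations[-1][0]
print(matching_meioses(query, generations[-1], hap, mother))
```

The printed counter maps a number of meioses to the number of matching individuals in the final generation at that distance from the queried individual.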
The documentation of mitolina consists of manual pages for the various available functions, an article describing how to use the package (vignette), and unit tests.
Research using this software for the interpretation of DNA profiles based on the mitogenome in forensic genetics has already been published (Andersen & Balding, 2018), and the aim is that this software can help further research on this important topic.
I would like to thank David J Balding for helpful discussions.