bayestestR: Describing Effects and their Uncertainty, Existence and Significance within the Bayesian Framework

The Bayesian framework for statistics is quickly gaining in popularity among scientists, for reasons such as reliability and accuracy (particularly in noisy data and small samples), the possibility of incorporating prior knowledge into the analysis, and the intuitive interpretation of results (Andrews & Baguley, 2013; Etz & Vandekerckhove, 2016; Kruschke, 2010; Kruschke, Aguinis, & Joo, 2012; Wagenmakers et al., 2017). Adopting the Bayesian framework is more of a shift in the paradigm than a change in the methodology; all the common statistical procedures (t-tests, correlations, ANOVAs, regressions, etc.) can also be achieved within the Bayesian framework. One of the core difference is that in the frequentist view, the effects are fixed (but unknown) and data are random. On the other hand, instead of having single estimates of the “true effect”, the Bayesian inference process computes the probability of different effects given the observed data, resulting in a distribution of possible values for the parameters, called the posterior distribution. The bayestestR package provides tools to describe these posterior distributions.


Introduction
The Bayesian framework for statistics is quickly gaining in popularity among scientists, for reasons such as reliability and accuracy (particularly in noisy data and small samples), the possibility of incorporating prior knowledge into the analysis, and the intuitive interpretation of results (Andrews & Baguley, 2013;Etz & Vandekerckhove, 2016;Kruschke, 2010;Kruschke, Aguinis, & Joo, 2012;Wagenmakers et al., 2017).Adopting the Bayesian framework is more of a shift in the paradigm than a change in the methodology; all the common statistical procedures (t-tests, correlations, ANOVAs, regressions, etc.) can also be achieved within the Bayesian framework.One of the core difference is that in the frequentist view, the effects are fixed (but unknown) and data are random.On the other hand, instead of having single estimates of the "true effect", the Bayesian inference process computes the probability of different effects given the observed data, resulting in a distribution of possible values for the parameters, called the posterior distribution.The bayestestR package provides tools to describe these posterior distributions.
Effects in the Bayesian framework can be described by characterizing their posterior distribution.Commonly reported indices include measures of centrality (e.g., the median, mean or MAP estimate) and uncertainty (the credible interval -CI).With caution, these can be considered the counterparts to the coefficient point-estimates and confidence intervals of the frequentist framework.Additionally, bayestestR also focuses on implementing a Bayesian null-hypothesis testing framework (in a Bayesian sense, i.e., extended to general testing of "effect existence") by providing access to both established and exploratory indices of effect existence and significance (such as the ROPE, Kruschke & Liddell, 2018;the MAP-based p-value, Mills, 2018; the Bayes factor, Morey & Rouder, 2011, or the Probability of Direction -pd).
Existing R packages allow users to easily fit a large variety of models and extract and visualize the posterior draws.However, most of these packages only return a limited set of indices (e.g., point-estimates and CIs).bayestestR provides a comprehensive and consistent set of functions to analyze and describe posterior distributions generated by a variety of models objects, including popular modeling packages such as rstanarm (Goodrich, Gabry, Ali, & Brilleman, 2018), brms (Bürkner, 2017), BayesFactor (Morey & Rouder, 2018), and emmeans (Lenth, 2019), thus making it a useful tool supporting the usage and development of Bayesian statistics.The main functions are described below, and full documentation is available on the package's website.

Examples of Features
The following demonstration of functions is accompanied by figures to illustrate the conceptional ideas behind the related indices.However, bayestestR functions also include plotting capabilities via the see package (Lüdecke, Waggoner, Ben-Shachar, & Makowski, 2019).

Indices of Centrality: Point-estimates
bayestestR offers two functions to compute point-estimates from posterior distributions: ma p_estimate() and point_estimate(), the latter providing options to calculate the mean, median or MAP estimate of a posterior distribution (see Figure 1).map_estimate() is a convenient function to calculate the Maximum A Posteriori (MAP) estimate directly.
The posterior mean minimizes expected squared error, whereas the posterior median minimizes expected absolute error (i.e., the difference of estimates from true values over samples).The MAP estimate corresponds to the most probable value of a posterior distribution.

Quantifying Uncertainty: The Credible Interval (CI)
In order to measure the uncertainty associated with the estimation, bayestestR provides two functions: eti(), the Equal-Tailed Interval (ETI), and hdi(), the Highest Density Interval (HDI).Both indices (accessible via the method argument in the ci() function) can be used in the context of Bayesian posterior characterisation as Credible Interval (CI).hdi() computes the HDI of a posterior distribution, i.e., the interval that contains all points within the interval having a higher probability density than points outside the interval (see Figure 2).HDIs have a particular property: Unlike an equal-tailed interval (computed by eti()) that typically excludes 2.5% from each tail of the distribution, the HDI is not equaltailed and therefore always includes the mode(s) of posterior distributions.
By default, hdi() and eti() return the 89% intervals (ci = 0.89), deemed to be more stable than, for instance, 95% intervals.An effective sample size of at least 10.000 is recommended if 95% intervals should be computed (Kruschke, 2015).Moreover, 89 indicates the arbitrariness of interval limits -its only remarkable property is being the highest prime number that does not exceed the already unstable 95% threshold (McElreath, 2018).[0.42, 7.27]

Null-Hypothesis Significance Testing (NHST)
The Bayesian framework allows one to neatly delineate and quantify different aspects of hypothesis testing, including effect existence and significance, and different indices have been developed to describe them.

ROPE and Test for Practical Equivalence
rope() computes the proportion of the HDI (default to the 89% HDI) of a posterior distribution that lies within a Region Of Practical Equivalence (the ROPE; see Figure 3, panel  A).
Statistically, the probability of a posterior distribution being different than 0 does not make much sense (the probability of it being different from a single point being infinite).Therefore, the idea underlining ROPE is to let the user define an area around the null value enclosing values that are equivalent to the null value for practical purposes (Kruschke, 2018;Kruschke & Liddell, 2018).In the absence of user-provided values, bayestestR will automatically find an appropriate range for the ROPE using the rope_range() function.

#> # Proportion of samples inside the ROPE
The proportion of HDI lying within this "null" region can be used as an decision criterion for "null-hypothesis" testing.Such a Test for Practical Equivalence, implemented via equivalence_test(), is based on the "HDI+ROPE decision rule" (Kruschke, 2018) to check whether parameter values should be accepted or rejected against an explicitly formulated "null hypothesis" (i.e., a ROPE).If the HDI is completely outside the ROPE, the "null hypothesis" for this parameter is "rejected".p_direction() computes the Probability of Direction (pd, also known as the Maximum Probability of Effect -MPE).This index of effect existence varies between 50% and 100% and can be interpreted as the probability that a parameter (described by its posterior distribution) is strictly positive or negative (whichever is the most probable).It is mathematically defined as the proportion of the posterior distribution that is of the median's sign (see Figure 3, panel B).
p_direction(distribution_normal(100, 0.4, 0.2)) Bayes Factor bayesfactor_parameters() computes Bayes factors against the null (either a point or an interval), based on prior and posterior samples of a single parameter.This Bayes factor indicates the degree by which the mass of the posterior distribution has shifted further away from or closer to the null value(s) (relative to the prior distribution), thus indicating if the null value has become less or more likely given the observed data.
When the null is an interval, the Bayes factor is computed by comparing the prior and posterior odds of the parameter falling within or outside the null; when the null is a point, a Savage-Dickey density ratio is computed, which is also an approximation of a Bayes factor comparing the marginal likelihoods of the model against a model in which the tested parameter has been restricted to the point null (Wagenmakers, Lodewyckx, Kuriyal, & Grasman, 2010).

Visualisation and Compatibility with Models
Most of bayestestR functions can be visualised (see Figure 4) by passing them to the plot() function (the see package needs to be installed).Moreover, these functions can be directly applied to statistical models (fitted for instance with rstanarm or brms), resulting in the description of the parameters of the model.Licensing and Availability bayestestR is licensed under the GNU General Public License (v3.0), with all source code stored at GitHub (https://github.com/easystats/bayestestR),and with a corresponding issue tracker for bug reporting and feature enhancements.In the spirit of honest and open science, we encourage requests/tips for fixes, feature updates, as well as general questions and concerns via direct interaction with contributors and developers.

Figure 1 :
Figure 1: Indices of centrality of the posterior distribution that are often used as point-estimates: the mean (in green), median (in red), and MAP estimate (in blue)

Figure 3 :
Figure 3: Posterior distributions (in yellow) of effects (the x-axis) with the illustration of indices of Null-Hypothesis Significance Testing: (A) The ROPE; (B) The Probability of Direction -pd; (C) The Savage-Dickey Bayes factor; (D) The MAP-based p-value.

#Figure 4 :
Figure 4: Visualisation of the pd for the parameters of a Bayesian regression model.
If the ROPE completely covers the HDI, i.e., all most credible values of a parameter are inside the ROPE, the null hypothesis is accepted.Otherwise, whether to accept or reject the null hypothesis is undecided.