TY - JOUR

T1 - Worked-out examples of the adequacy of Bayesian optional stopping

AU - Tendeiro, Jorge N

AU - Kiers, Henk A L

AU - van Ravenzwaaij, Don

N1 - © 2021. The Psychonomic Society, Inc.

PY - 2022/2

Y1 - 2022/2

N2 - The practice of sequentially testing a null hypothesis as data are collected until the null hypothesis is rejected is known as optional stopping. It is well known that optional stopping is problematic in the context of p value-based null hypothesis significance testing: The false-positive rates quickly overcome the single test's significance level. However, the state of affairs under null hypothesis Bayesian testing, where p values are replaced by Bayes factors, has perhaps surprisingly been much less consensual. Rouder (2014) used simulations to defend the use of optional stopping under null hypothesis Bayesian testing. The idea behind these simulations is closely related to the idea of sampling from prior predictive distributions. Deng et al. (2016) and Hendriksen et al. (2020) have provided mathematical evidence to the effect that optional stopping under null hypothesis Bayesian testing does hold under some conditions. These papers are, however, exceedingly technical for most researchers in the applied social sciences. In this paper, we provide some mathematical derivations concerning Rouder's approximate simulation results for the two Bayesian hypothesis tests that he considered. The key idea is to consider the probability distribution of the Bayes factor, which is regarded as being a random variable across repeated sampling. This paper therefore offers an intuitive perspective to the literature and we believe it is a valid contribution towards understanding the practice of optional stopping in the context of Bayesian hypothesis testing.

AB - The practice of sequentially testing a null hypothesis as data are collected until the null hypothesis is rejected is known as optional stopping. It is well known that optional stopping is problematic in the context of p value-based null hypothesis significance testing: The false-positive rates quickly overcome the single test's significance level. However, the state of affairs under null hypothesis Bayesian testing, where p values are replaced by Bayes factors, has perhaps surprisingly been much less consensual. Rouder (2014) used simulations to defend the use of optional stopping under null hypothesis Bayesian testing. The idea behind these simulations is closely related to the idea of sampling from prior predictive distributions. Deng et al. (2016) and Hendriksen et al. (2020) have provided mathematical evidence to the effect that optional stopping under null hypothesis Bayesian testing does hold under some conditions. These papers are, however, exceedingly technical for most researchers in the applied social sciences. In this paper, we provide some mathematical derivations concerning Rouder's approximate simulation results for the two Bayesian hypothesis tests that he considered. The key idea is to consider the probability distribution of the Bayes factor, which is regarded as being a random variable across repeated sampling. This paper therefore offers an intuitive perspective to the literature and we believe it is a valid contribution towards understanding the practice of optional stopping in the context of Bayesian hypothesis testing.

U2 - 10.3758/s13423-021-01962-5

DO - 10.3758/s13423-021-01962-5

M3 - Review article

C2 - 34254263

VL - 29

SP - 70

EP - 87

JO - Psychonomic Bulletin & Review

JF - Psychonomic Bulletin & Review

SN - 1069-9384

ER -