Cavalier Use of Inferential Statistics Is a Major Source of False and Irreproducible Scientific Findings
I uncover previously underappreciated systematic sources of false and irreproducible results in natural, biomedical and social sciences that are rooted in statistical methodology. They include the inevitably occurring deviations from basic assumptions behind statistical analyses and the use of vario...
Main Author: | Leonid Hanin |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2021-03-01 |
Series: | Mathematics |
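Subjects: | central limit theorem; distributional homogeneity; law of large numbers; probability metric; p-value; random sample size |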
Online Access: | https://www.mdpi.com/2227-7390/9/6/603 |
id |
doaj-0dff556f4b9948209d685eb08d491f7f |
---|---|
record_format |
Article |
spelling |
doaj-0dff556f4b9948209d685eb08d491f7f; 2021-03-12T00:03:42Z; eng; MDPI AG; Mathematics; ISSN 2227-7390; 2021-03-01; vol. 9, no. 6, art. 603; DOI 10.3390/math9060603; Leonid Hanin, Department of Mathematics and Statistics, Idaho State University, 921 S. 8th Avenue, Stop 8085, Pocatello, ID 83209-8085, USA |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Leonid Hanin |
title |
Cavalier Use of Inferential Statistics Is a Major Source of False and Irreproducible Scientific Findings |
publisher |
MDPI AG |
series |
Mathematics |
issn |
2227-7390 |
publishDate |
2021-03-01 |
description |
I uncover previously underappreciated systematic sources of false and irreproducible results in the natural, biomedical and social sciences that are rooted in statistical methodology. They include the inevitable deviations from the basic assumptions behind statistical analyses and the use of various approximations. I show through a number of examples that (a) arbitrarily small deviations from distributional homogeneity can lead to arbitrarily large deviations in the outcomes of statistical analyses; (b) samples of random size may violate the Law of Large Numbers and thus are generally unsuitable for conventional statistical inference; (c) the same is true, in particular, when random sample size and observations are stochastically dependent; and (d) the use of the Gaussian approximation based on the Central Limit Theorem has dramatic implications for p-values and statistical significance, essentially making the pursuit of small significance levels and p-values for a fixed sample size meaningless. The latter is proven rigorously in the case of the one-sided Z test. This article could serve as cautionary guidance for scientists and practitioners employing statistical methods in their work. |
topic |
central limit theorem; distributional homogeneity; law of large numbers; probability metric; p-value; random sample size |
url |
https://www.mdpi.com/2227-7390/9/6/603 |
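Point (d) of the abstract, on the Gaussian approximation and small p-values, can be illustrated with a minimal numerical sketch. The setting below is an assumption made for illustration only and is not taken from the article: n = 30 independent Exponential(1) observations (known mean 1 and standard deviation 1) and a one-sided Z test for a mean larger than 1. In that setting the sum of the observations follows a Gamma(n, 1) distribution, so the exact p-value is available in closed form and can be compared with the Central Limit Theorem approximation. The function names are hypothetical.

```python
import math


def gaussian_p_value(z: float) -> float:
    """Upper-tail p-value of the one-sided Z test under the normal approximation."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def exact_p_value(z: float, n: int) -> float:
    """Exact upper-tail p-value, ASSUMING n i.i.d. Exponential(1) observations.

    The sum of the observations is Gamma(n, 1), and its upper tail at s
    equals P(Poisson(s) <= n - 1).
    """
    s = n + z * math.sqrt(n)  # sum of observations that yields the statistic z
    term = math.exp(-s)       # Poisson(s) probability of 0
    total = term
    for k in range(1, n):
        term *= s / k         # Poisson(s) probability of k
        total += term
    return total


n = 30  # fixed sample size (illustrative choice)
for z in (2.0, 3.0, 4.0):
    approx = gaussian_p_value(z)
    exact = exact_p_value(z, n)
    print(f"z={z}: Gaussian p={approx:.3e}, exact p={exact:.3e}, "
          f"ratio={exact / approx:.1f}")
```

In this assumed setting the exact p-value exceeds the Gaussian one, and the ratio between the two grows as z increases, which is consistent with the abstract's warning about pursuing very small p-values at a fixed sample size.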