Statistical Significance Filtering Overestimates Effects and Impedes Falsification: A Critique of Endsley (2019)
Whether in meta-analysis or single experiments, selecting results based on statistical significance leads to overestimated effect sizes, impeding falsification. We critique a quantitative synthesis that used significance to score and select previously published effects for situation awareness-performance associations (Endsley, 2019). How much does selection using statistical significance quantitatively impact results in a meta-analytic context? We evaluate and compare results using significance-filtered effects versus analyses with all effects as-reported. Endsley reported high predictiveness scores and large positive mean correlations but used atypical methods: the hypothesis was used to select papers and effects. Papers were assigned the maximum predictiveness score if they contained at least one significant effect, yet most papers reported multiple effects, and the number of non-significant effects did not impact the score. Thus, the predictiveness score was rarely less than the maximum. In addition, only significant effects were included in Endsley's quantitative synthesis. Filtering excluded half of all reported effects, with guaranteed minimum effect sizes based on sample size. Results for filtered compared to as-reported effects clearly diverged. Compared to the mean of as-reported effects, the filtered mean was overestimated by 56%. Furthermore, 92% (or 222 out of 241) of the as-reported effects were below the mean of filtered effects. We conclude that outcome-dependent selection of effects is circular, predetermining results and running contrary to the purpose of meta-analysis. Instead of using significance to score and filter effects, meta-analyses should follow established research practices.
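The abstract's central claims, that conditioning on statistical significance inflates mean effect sizes and imposes a sample-size-dependent minimum effect, can be illustrated with a short simulation. The sketch below is not from the critique itself; the number of studies, sample size, and true correlation are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of how selecting only statistically
# significant correlations inflates the mean effect size. Parameter values
# (number of studies, n per study, true correlation) are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def simulate_study(true_r, n):
    """Draw n bivariate-normal pairs with correlation true_r; return (r, p)."""
    cov = [[1.0, true_r], [true_r, 1.0]]
    x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    return stats.pearsonr(x, y)  # (correlation, two-sided p-value)

# Hypothetical literature: 500 small studies of a modest true association.
results = [simulate_study(true_r=0.25, n=30) for _ in range(500)]
all_r = np.array([r for r, p in results])
sig_r = np.array([r for r, p in results if p < 0.05])

print(f"Mean of all effects:      {all_r.mean():.2f}")
print(f"Mean of significant only: {sig_r.mean():.2f}")  # noticeably larger

# The significance filter also guarantees a minimum |r| for a given n:
# |r| must exceed t_crit / sqrt(t_crit**2 + n - 2) to reach p < .05 (two-sided).
n = 30
t_crit = stats.t.ppf(0.975, df=n - 2)
r_min = t_crit / np.sqrt(t_crit**2 + n - 2)
print(f"Minimum significant |r| at n={n}: {r_min:.2f}")
```

With these illustrative settings, the mean of the significant-only correlations is noticeably larger than the mean of all simulated correlations, mirroring the direction (though not the exact magnitude) of the 56% overestimate reported in the critique.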
Main Authors: | Jonathan Z. Bakdash, Laura R. Marusich, Jared B. Kenworthy, Elyssa Twedt, Erin G. Zaroukian |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2020-12-01 |
Series: | Frontiers in Psychology |
ISSN: | 1664-1078 |
DOI: | 10.3389/fpsyg.2020.609647 |
Subjects: | significance filter; selection bias; p-hacking; meta-analysis; confirmation bias; situation awareness |
Online Access: | https://www.frontiersin.org/articles/10.3389/fpsyg.2020.609647/full |
Author Affiliations: | |
---|---|
Jonathan Z. Bakdash | United States Army Combat Capabilities Development Command, Army Research Laboratory South at the University of Texas at Dallas, Richardson, TX, United States; Department of Psychology and Special Education, Texas A&M University–Commerce, Commerce, TX, United States |
Laura R. Marusich | United States Army Combat Capabilities Development Command, Army Research Laboratory South at the University of Texas at Arlington, Arlington, TX, United States |
Jared B. Kenworthy | Department of Psychology, University of Texas at Arlington, Arlington, TX, United States |
Elyssa Twedt | Department of Psychology, St. Lawrence University, Canton, NY, United States |
Erin G. Zaroukian | United States Army Combat Capabilities Development Command, Army Research Laboratory, Computational and Information Sciences Directorate, Aberdeen, MD, United States |