Statistical Significance Filtering Overestimates Effects and Impedes Falsification: A Critique of Endsley (2019)

Whether in meta-analysis or single experiments, selecting results based on statistical significance leads to overestimated effect sizes, impeding falsification. We critique a quantitative synthesis that used significance to score and select previously published effects for situation awareness-performance associations (Endsley, 2019). How much does selection using statistical significance quantitatively impact results in a meta-analytic context? We evaluate and compare results using significance-filtered effects versus analyses with all effects as-reported. Endsley reported high predictiveness scores and large positive mean correlations but used atypical methods: the hypothesis was used to select papers and effects. Papers were assigned the maximum predictiveness score if they contained at least one significant effect, yet most papers reported multiple effects, and the number of non-significant effects did not impact the score. Thus, the predictiveness score was rarely less than the maximum. In addition, only significant effects were included in Endsley's quantitative synthesis. Filtering excluded half of all reported effects, with guaranteed minimum effect sizes based on sample size. Results for filtered compared to as-reported effects clearly diverged. Compared to the mean of as-reported effects, the filtered mean was overestimated by 56%. Furthermore, 92% (222 out of 241) of the as-reported effects were below the mean of filtered effects. We conclude that outcome-dependent selection of effects is circular, predetermining results and running contrary to the purpose of meta-analysis. Instead of using significance to score and filter effects, meta-analyses should follow established research practices.
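
The abstract's central mechanism, that conditioning on p < .05 both inflates the mean effect and guarantees a minimum effect size determined by sample size, can be illustrated with a small simulation. The following is a minimal sketch under assumed values (the true correlation, study count, and sample sizes are hypothetical choices for demonstration, not data from Endsley, 2019 or from this critique):

```python
# Illustrative sketch (not the paper's analysis): simulate how keeping
# only statistically significant correlations inflates the mean effect.
# true_r, n_studies, and the sample-size range are assumed values.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_r = 0.3                                          # assumed true correlation
n_studies = 2000                                      # assumed number of studies
sample_sizes = rng.integers(20, 60, size=n_studies)   # assumed study sizes

observed, significant = [], []
for n in sample_sizes:
    # Draw a bivariate-normal sample with the assumed true correlation.
    cov = [[1.0, true_r], [true_r, 1.0]]
    x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    r, p = stats.pearsonr(x, y)
    observed.append(r)
    if p < 0.05:                                      # the significance filter
        significant.append(r)

print(f"mean of all effects as-reported: {np.mean(observed):.3f}")
print(f"mean of significant-only effects: {np.mean(significant):.3f}")

# The filter also guarantees a minimum effect size that depends only on n:
# since t = r * sqrt(n - 2) / sqrt(1 - r**2), the smallest |r| reaching
# p < .05 (two-sided) is r_min = t_crit / sqrt(t_crit**2 + n - 2).
n = 30
t_crit = stats.t.ppf(0.975, df=n - 2)
r_min = t_crit / np.sqrt(t_crit**2 + n - 2)
print(f"minimum significant |r| at n={n}: {r_min:.3f}")
```

With these assumed values, the filtered mean lands well above the true correlation, and at n = 30 no significant correlation can be smaller than about |r| = .36, mirroring the abstract's point that the filter guarantees a minimum effect size from sample size alone.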

Bibliographic Details
Main Authors: Jonathan Z. Bakdash, Laura R. Marusich, Jared B. Kenworthy, Elyssa Twedt, Erin G. Zaroukian
Author Affiliations:
Jonathan Z. Bakdash: United States Army Combat Capabilities Development Command, Army Research Laboratory South at the University of Texas at Dallas, Richardson, TX, United States; Department of Psychology and Special Education, Texas A&M University–Commerce, Commerce, TX, United States
Laura R. Marusich: United States Army Combat Capabilities Development Command, Army Research Laboratory South at the University of Texas at Arlington, Arlington, TX, United States
Jared B. Kenworthy: Department of Psychology, University of Texas at Arlington, Arlington, TX, United States
Elyssa Twedt: Department of Psychology, St. Lawrence University, Canton, NY, United States
Erin G. Zaroukian: United States Army Combat Capabilities Development Command, Army Research Laboratory, Computational and Information Sciences Directorate, Aberdeen, MD, United States
Format: Article
Language: English
Published: Frontiers Media S.A., 2020-12-01
Series: Frontiers in Psychology
ISSN: 1664-1078
DOI: 10.3389/fpsyg.2020.609647
Subjects: significance filter; selection bias; p-hacking; meta-analysis; confirmation bias; situation awareness
Online Access: https://www.frontiersin.org/articles/10.3389/fpsyg.2020.609647/full