Statistical Significance Filtering Overestimates Effects and Impedes Falsification: A Critique of Endsley (2019)
Whether in meta-analysis or single experiments, selecting results based on statistical significance leads to overestimated effect sizes, impeding falsification. We critique a quantitative synthesis that used significance to score and select previously published effects for situation awareness-performance associations (Endsley, 2019). How much does selection using statistical significance quantitatively impact results in a meta-analytic context? We evaluate and compare results using significance-filtered effects versus analyses with all effects as-reported. Endsley reported high predictiveness scores and large positive mean correlations but used atypical methods: the hypothesis was used to select papers and effects. Papers were assigned the maximum predictiveness score if they contained at least one significant effect, yet most papers reported multiple effects, and the number of non-significant effects did not impact the score. Thus, the predictiveness score was rarely less than the maximum. In addition, only significant effects were included in Endsley's quantitative synthesis. Filtering excluded half of all reported effects, with guaranteed minimum effect sizes based on sample size. Results for filtered compared to as-reported effects clearly diverged. Compared to the mean of as-reported effects, the filtered mean was overestimated by 56%. Furthermore, 92% (or 222 out of 241) of the as-reported effects were below the mean of filtered effects. We conclude that outcome-dependent selection of effects is circular, predetermining results and running contrary to the purpose of meta-analysis. Instead of using significance to score and filter effects, meta-analyses should follow established research practices.
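The abstract's central claims, that conditioning on statistical significance inflates mean effect sizes and imposes a sample-size-dependent minimum effect, can be illustrated with a short simulation. The sketch below is not from the critique itself; the number of studies, sample size, and true correlation are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of how selecting only statistically
# significant correlations inflates the mean effect size. Parameter values
# (number of studies, n per study, true correlation) are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def simulate_study(true_r, n):
    """Draw n bivariate-normal pairs with correlation true_r; return (r, p)."""
    cov = [[1.0, true_r], [true_r, 1.0]]
    x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    return stats.pearsonr(x, y)  # (correlation, two-sided p-value)

# Hypothetical literature: 500 small studies of a modest true association.
results = [simulate_study(true_r=0.25, n=30) for _ in range(500)]
all_r = np.array([r for r, p in results])
sig_r = np.array([r for r, p in results if p < 0.05])

print(f"Mean of all effects:      {all_r.mean():.2f}")
print(f"Mean of significant only: {sig_r.mean():.2f}")  # noticeably larger

# The significance filter also guarantees a minimum |r| for a given n:
# |r| must exceed t_crit / sqrt(t_crit**2 + n - 2) to reach p < .05 (two-sided).
n = 30
t_crit = stats.t.ppf(0.975, df=n - 2)
r_min = t_crit / np.sqrt(t_crit**2 + n - 2)
print(f"Minimum significant |r| at n={n}: {r_min:.2f}")
```

With these illustrative settings, the mean of the significant-only correlations is noticeably larger than the mean of all simulated correlations, mirroring the direction (though not the exact magnitude) of the 56% overestimate reported in the critique.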
Main Authors: | Jonathan Z. Bakdash, Laura R. Marusich, Jared B. Kenworthy, Elyssa Twedt, Erin G. Zaroukian |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2020-12-01 |
Series: | Frontiers in Psychology |
ISSN: | 1664-1078 |
DOI: | 10.3389/fpsyg.2020.609647 |
Subjects: | significance filter; selection bias; p-hacking; meta-analysis; confirmation bias; situation awareness |
Online Access: | https://www.frontiersin.org/articles/10.3389/fpsyg.2020.609647/full |
Author Affiliations: | |
---|---|
Jonathan Z. Bakdash | United States Army Combat Capabilities Development Command, Army Research Laboratory South at the University of Texas at Dallas, Richardson, TX, United States; Department of Psychology and Special Education, Texas A&M University–Commerce, Commerce, TX, United States |
Laura R. Marusich | United States Army Combat Capabilities Development Command, Army Research Laboratory South at the University of Texas at Arlington, Arlington, TX, United States |
Jared B. Kenworthy | Department of Psychology, University of Texas at Arlington, Arlington, TX, United States |
Elyssa Twedt | Department of Psychology, St. Lawrence University, Canton, NY, United States |
Erin G. Zaroukian | United States Army Combat Capabilities Development Command, Army Research Laboratory, Computational and Information Sciences Directorate, Aberdeen, MD, United States |