How Big of a Problem is Analytic Error in Secondary Analyses of Survey Data?

Secondary analyses of survey data collected from large probability samples of persons or establishments further scientific progress in many fields. The complex design features of these samples improve data collection efficiency, but also require analysts to account for these features when conducting...

Bibliographic Details
Main Authors: Brady T West, Joseph W Sakshaug, Guy Alain S Aurelien
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2016-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC4927119?pdf=render
id doaj-15b9c8b2b974455299a60f18e4a51f8e
record_format Article
spelling doaj-15b9c8b2b974455299a60f18e4a51f8e; 2020-11-25T02:43:08Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2016-01-01; volume 11, issue 6, e0158120; doi:10.1371/journal.pone.0158120; How Big of a Problem is Analytic Error in Secondary Analyses of Survey Data?; Brady T West; Joseph W Sakshaug; Guy Alain S Aurelien; http://europepmc.org/articles/PMC4927119?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Brady T West
Joseph W Sakshaug
Guy Alain S Aurelien
title How Big of a Problem is Analytic Error in Secondary Analyses of Survey Data?
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2016-01-01
description Secondary analyses of survey data collected from large probability samples of persons or establishments further scientific progress in many fields. The complex design features of these samples improve data collection efficiency, but also require analysts to account for these features when conducting analysis. Unfortunately, many secondary analysts from fields outside of statistics, biostatistics, and survey methodology do not have adequate training in this area, and as a result may apply incorrect statistical methods when analyzing these survey data sets. This in turn could lead to the publication of incorrect inferences based on the survey data that effectively negate the resources dedicated to these surveys. In this article, we build on the results of a preliminary meta-analysis of 100 peer-reviewed journal articles presenting analyses of data from a variety of national health surveys, which suggested that analytic errors may be extremely prevalent in these types of investigations. We first perform a meta-analysis of a stratified random sample of 145 additional research products analyzing survey data from the Scientists and Engineers Statistical Data System (SESTAT), which describes features of the U.S. Science and Engineering workforce, and examine trends in the prevalence of analytic error across the decades used to stratify the sample. We once again find that analytic errors appear to be quite prevalent in these studies. Next, we present several example analyses of real SESTAT data, and demonstrate that a failure to perform these analyses correctly can result in substantially biased estimates with standard errors that do not adequately reflect complex sample design features. Collectively, the results of this investigation suggest that reviewers of this type of research need to pay much closer attention to the analytic methods employed by researchers attempting to publish or present secondary analyses of survey data.
url http://europepmc.org/articles/PMC4927119?pdf=render