Sensitivity of Mixed-Source Statistics to Classification Errors

For policymakers and other users of official statistics, it is crucial to distinguish real differences underlying statistical outcomes from noise caused by various error sources in the statistical process. This has become more difficult as official statistics are increasingly based upon a mix of sou...


Bibliographic Details
Main Authors: Burger Joep, Delden Arnout van, Scholtus Sander
Format: Article
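Language: English
Published: Sciendo 2015-09-01
Series: Journal of Official Statistics
Subjects: accuracy; coverage error; administrative data; short-term business statistics; bootstrap; resampling
Online Access: https://doi.org/10.1515/jos-2015-0029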
id doaj-705c36c163304c4286264850c328033c
record_format Article
spelling doaj-705c36c163304c4286264850c328033c
2021-09-06T19:40:51Z
eng
Sciendo
Journal of Official Statistics
2001-7367
2015-09-01
31
3
489
506
10.1515/jos-2015-0029
jos-2015-0029
Sensitivity of Mixed-Source Statistics to Classification Errors
Burger Joep 0
Delden Arnout van 1
Scholtus Sander 2
Statistics Netherlands, Department of Process Development and Methodology, CBS-weg 11, P.O. Box 4481, 6401 CZ Heerlen, The Netherlands
Statistics Netherlands, Department of Process Development and Methodology, Henri Faasdreef 312, P.O. Box 24500, 2490 HA The Hague, The Netherlands
Statistics Netherlands, Department of Process Development and Methodology, Henri Faasdreef 312, P.O. Box 24500, 2490 HA The Hague, The Netherlands
For policymakers and other users of official statistics, it is crucial to distinguish real differences underlying statistical outcomes from noise caused by various error sources in the statistical process. This has become more difficult as official statistics are increasingly based upon a mix of sources that typically do not involve probability sampling. In this article, we apply a resampling method to assess the sensitivity of mixed-source statistics to source-specific classification errors. Classification errors can be seen as coverage errors within a stratum. The method can be used to compare relative accuracies between strata and releases, it can assist in deciding how to optimally allocate resources in the statistical process, and it can be applied in evaluating potential estimators. A case study on short-term business statistics shows that bias occurs especially for those strata that deviate strongly from the mean value in other strata. It also suggests that shifting classification resources from small and medium-sized enterprises to large enterprises has virtually no net effect on accuracy, because the gain in precision is offset by the creation of bias. The resampling method can be extended to include other types of nonsampling error.
https://doi.org/10.1515/jos-2015-0029
accuracy
coverage error
administrative data
short-term business statistics
bootstrap
resampling
collection DOAJ
language English
format Article
sources DOAJ
author Burger Joep
Delden Arnout van
Scholtus Sander
spellingShingle Burger Joep
Delden Arnout van
Scholtus Sander
Sensitivity of Mixed-Source Statistics to Classification Errors
Journal of Official Statistics
accuracy
coverage error
administrative data
short-term business statistics
bootstrap
resampling
author_facet Burger Joep
Delden Arnout van
Scholtus Sander
author_sort Burger Joep
title Sensitivity of Mixed-Source Statistics to Classification Errors
title_short Sensitivity of Mixed-Source Statistics to Classification Errors
title_full Sensitivity of Mixed-Source Statistics to Classification Errors
title_fullStr Sensitivity of Mixed-Source Statistics to Classification Errors
title_full_unstemmed Sensitivity of Mixed-Source Statistics to Classification Errors
title_sort sensitivity of mixed-source statistics to classification errors
publisher Sciendo
series Journal of Official Statistics
issn 2001-7367
publishDate 2015-09-01
description For policymakers and other users of official statistics, it is crucial to distinguish real differences underlying statistical outcomes from noise caused by various error sources in the statistical process. This has become more difficult as official statistics are increasingly based upon a mix of sources that typically do not involve probability sampling. In this article, we apply a resampling method to assess the sensitivity of mixed-source statistics to source-specific classification errors. Classification errors can be seen as coverage errors within a stratum. The method can be used to compare relative accuracies between strata and releases, it can assist in deciding how to optimally allocate resources in the statistical process, and it can be applied in evaluating potential estimators. A case study on short-term business statistics shows that bias occurs especially for those strata that deviate strongly from the mean value in other strata. It also suggests that shifting classification resources from small and medium-sized enterprises to large enterprises has virtually no net effect on accuracy, because the gain in precision is offset by the creation of bias. The resampling method can be extended to include other types of nonsampling error.
topic accuracy
coverage error
administrative data
short-term business statistics
bootstrap
resampling
url https://doi.org/10.1515/jos-2015-0029
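note The abstract describes resampling under classification errors only in general terms. The sketch below is one possible reading of that idea, not the authors' procedure: each unit's stratum code is redrawn from an assumed error model, stratum totals are recomputed, and the spread and shift of those totals across replicates are taken as rough indicators of precision and bias. The function name resample_stratum_totals, the single p_correct parameter, the uniform reassignment of misclassified units, and the toy data are all assumptions made for illustration.

```python
import random
from statistics import mean, pvariance


def resample_stratum_totals(units, strata, p_correct, n_rep=500, seed=0):
    """Redraw stratum codes and summarise the effect on stratum totals.

    units     : list of (observed_stratum, value) pairs
    strata    : list of all stratum labels
    p_correct : assumed probability that a unit keeps its observed code;
                otherwise it moves to another stratum chosen uniformly
                (a hypothetical error model, not taken from the article)
    Returns {stratum: (bias, standard_deviation)} relative to the totals
    computed from the observed codes.
    """
    rng = random.Random(seed)

    # Totals under the observed classification serve as the reference.
    base = {s: 0.0 for s in strata}
    for s, v in units:
        base[s] += v

    draws = {s: [] for s in strata}
    for _ in range(n_rep):
        totals = {s: 0.0 for s in strata}
        for s, v in units:
            if len(strata) == 1 or rng.random() < p_correct:
                t = s  # code left unchanged
            else:
                t = rng.choice([x for x in strata if x != s])  # misclassified
            totals[t] += v
        for s in strata:
            draws[s].append(totals[s])

    return {
        s: (mean(draws[s]) - base[s], pvariance(draws[s]) ** 0.5)
        for s in strata
    }


# Toy data (invented): stratum A holds large units, stratum B small ones.
toy_units = [("A", 100.0)] * 50 + [("B", 10.0)] * 50
toy_result = resample_stratum_totals(toy_units, ["A", "B"], p_correct=0.8)
for stratum, (bias, sd) in toy_result.items():
    print(stratum, round(bias, 1), round(sd, 1))
```

In this toy setting the total over all strata is preserved in every replicate, so misclassification only moves value between strata: the stratum with large units loses value on average and the stratum with small units gains it. A source-specific version would use a separate p_correct (or a full transition matrix) per data source.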