Sensitivity of Mixed-Source Statistics to Classification Errors
For policymakers and other users of official statistics, it is crucial to distinguish real differences underlying statistical outcomes from noise caused by various error sources in the statistical process. This has become more difficult as official statistics are increasingly based upon a mix of sou...
Main Authors: | Joep Burger, Arnout van Delden, Sander Scholtus |
---|---|
Format: | Article |
Language: | English |
Published: | Sciendo, 2015-09-01 |
Series: | Journal of Official Statistics |
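Subjects: | accuracy; coverage error; administrative data; short-term business statistics; bootstrap; resampling |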
Online Access: | https://doi.org/10.1515/jos-2015-0029 |
id |
doaj-705c36c163304c4286264850c328033c |
---|---|
record_format |
Article |
spelling |
doaj-705c36c163304c4286264850c328033c; 2021-09-06T19:40:51Z; eng; Sciendo; Journal of Official Statistics; ISSN 2001-7367; 2015-09-01; vol. 31, no. 3, pp. 489-506; DOI 10.1515/jos-2015-0029; jos-2015-0029.
Title: Sensitivity of Mixed-Source Statistics to Classification Errors.
Authors: Burger Joep (Statistics Netherlands, Department of Process Development and Methodology, CBS-weg 11, P.O. Box 4481, 6401 CZ Heerlen, The Netherlands); Delden Arnout van (Statistics Netherlands, Department of Process Development and Methodology, Henri Faasdreef 312, P.O. Box 24500, 2490 HA The Hague, The Netherlands); Scholtus Sander (Statistics Netherlands, Department of Process Development and Methodology, Henri Faasdreef 312, P.O. Box 24500, 2490 HA The Hague, The Netherlands).
Abstract: For policymakers and other users of official statistics, it is crucial to distinguish real differences underlying statistical outcomes from noise caused by various error sources in the statistical process. This has become more difficult as official statistics are increasingly based upon a mix of sources that typically do not involve probability sampling. In this article, we apply a resampling method to assess the sensitivity of mixed-source statistics to source-specific classification errors. Classification errors can be seen as coverage errors within a stratum. The method can be used to compare relative accuracies between strata and releases, it can assist in deciding how to optimally allocate resources in the statistical process, and it can be applied in evaluating potential estimators. A case study on short-term business statistics shows that bias occurs especially for those strata that deviate strongly from the mean value in other strata. It also suggests that shifting classification resources from small and medium-sized enterprises to large enterprises has virtually no net effect on accuracy, because the gain in precision is offset by the creation of bias. The resampling method can be extended to include other types of nonsampling error.
URL: https://doi.org/10.1515/jos-2015-0029.
Keywords: accuracy; coverage error; administrative data; short-term business statistics; bootstrap; resampling |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Burger Joep Delden Arnout van Scholtus Sander |
spellingShingle |
Burger Joep Delden Arnout van Scholtus Sander Sensitivity of Mixed-Source Statistics to Classification Errors Journal of Official Statistics accuracy coverage error administrative data short-term business statistics bootstrap resampling |
author_facet |
Burger Joep Delden Arnout van Scholtus Sander |
author_sort |
Burger Joep |
title |
Sensitivity of Mixed-Source Statistics to Classification Errors |
title_sort |
sensitivity of mixed-source statistics to classification errors |
publisher |
Sciendo |
series |
Journal of Official Statistics |
issn |
2001-7367 |
publishDate |
2015-09-01 |
description |
For policymakers and other users of official statistics, it is crucial to distinguish real differences underlying statistical outcomes from noise caused by various error sources in the statistical process. This has become more difficult as official statistics are increasingly based upon a mix of sources that typically do not involve probability sampling. In this article, we apply a resampling method to assess the sensitivity of mixed-source statistics to source-specific classification errors. Classification errors can be seen as coverage errors within a stratum. The method can be used to compare relative accuracies between strata and releases, it can assist in deciding how to optimally allocate resources in the statistical process, and it can be applied in evaluating potential estimators. A case study on short-term business statistics shows that bias occurs especially for those strata that deviate strongly from the mean value in other strata. It also suggests that shifting classification resources from small and medium-sized enterprises to large enterprises has virtually no net effect on accuracy, because the gain in precision is offset by the creation of bias. The resampling method can be extended to include other types of nonsampling error. |
topic |
accuracy coverage error administrative data short-term business statistics bootstrap resampling |
url |
https://doi.org/10.1515/jos-2015-0029 |
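The abstract describes a resampling method for assessing how sensitive stratum-level statistics are to classification errors. The record gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a hypothetical matrix `transition`, where `transition[i, j]` is the assumed probability that a unit observed in stratum `i` is assigned to stratum `j` in a replicate; the function name, parameters, and the choice of stratum totals as the target statistic are all illustrative assumptions.

```python
import numpy as np


def bootstrap_classification_error(values, strata, transition, n_boot=500, seed=0):
    """Redraw stratum codes under ASSUMED misclassification probabilities.

    Returns the stratum totals under the observed classification, plus the
    estimated bias and standard deviation of those totals across replicates.
    All names here are hypothetical and chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    strata = np.asarray(strata, dtype=int)
    transition = np.asarray(transition, dtype=float)
    n_strata = transition.shape[0]

    # Stratum totals under the observed classification (the reference point).
    observed = np.bincount(strata, weights=values, minlength=n_strata)

    # Row of cumulative transition probabilities for each unit.
    cum = np.cumsum(transition[strata], axis=1)

    totals = np.empty((n_boot, n_strata))
    for b in range(n_boot):
        u = rng.random(len(values))
        # Inverse-CDF draw: count how many cumulative bounds lie below u.
        drawn = (u[:, None] > cum).sum(axis=1)
        # Guard against floating-point rows that sum to slightly under 1.
        drawn = np.minimum(drawn, n_strata - 1)
        totals[b] = np.bincount(drawn, weights=values, minlength=n_strata)

    bias = totals.mean(axis=0) - observed
    sd = totals.std(axis=0, ddof=1)
    return observed, bias, sd
```

Under these assumptions, a stratum whose units differ strongly in value from the units that leak in from other strata would tend to show a larger estimated bias, which is the kind of pattern the abstract reports for the case study. Comparing `bias` and `sd` across alternative `transition` matrices is one way to explore the trade-off between precision and bias that the abstract mentions.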