SamplingStrata: An R Package for the Optimization of Stratified Sampling
When designing a sampling survey, constraints are usually set on the desired precision levels regarding one or more target estimates (the Ys). If a sampling frame is available, containing auxiliary information related to each unit (the Xs), it is possible to adopt a stratified sample design. For any...
Main Author: | Giulio Barcaroli |
---|---|
Format: | Article |
Language: | English |
Published: | Foundation for Open Access Statistics, 2014-11-01 |
Series: | Journal of Statistical Software |
Online Access: | http://www.jstatsoft.org/index.php/jss/article/view/2192 |
id |
doaj-05464d07800f42518dd6ee8d0e5b4a6e |
---|---|
record_format |
Article |
spelling |
doaj-05464d07800f42518dd6ee8d0e5b4a6e 2020-11-24T23:55:34Z eng Foundation for Open Access Statistics Journal of Statistical Software 1548-7660 2014-11-01 611124 10.18637/jss.v061.i04 796 SamplingStrata: An R Package for the Optimization of Stratified Sampling Giulio Barcaroli When designing a sampling survey, constraints are usually set on the desired precision levels regarding one or more target estimates (the Ys). If a sampling frame is available, containing auxiliary information related to each unit (the Xs), it is possible to adopt a stratified sample design. For any given stratification of the frame, in the multivariate case it is possible to solve the problem of the best allocation of units in strata, by minimizing a cost function subject to precision constraints (or, conversely, by maximizing the precision of the estimates under a given budget). The problem is to determine the best stratification in the frame, i.e., the one that ensures the overall minimal cost of the sample necessary to satisfy precision constraints. The Xs can be categorical or continuous; continuous ones can be transformed into categorical ones. The most detailed stratification is given by the Cartesian product of the Xs (the atomic strata). A way to determine the best stratification is to explore exhaustively the set of all possible partitions derivable from the set of atomic strata, evaluating each one by calculating the corresponding cost in terms of the sample required to satisfy precision constraints. This is unaffordable in practical situations, where the dimension of the space of the partitions can be very high. Another possible way is to explore the space of partitions with an algorithm that is particularly suitable in such situations: the genetic algorithm.
The R package SamplingStrata, based on the use of a genetic algorithm, makes it possible to determine the best stratification for a population frame, i.e., the one that ensures the minimum sample cost necessary to satisfy precision constraints, in a multivariate and multi-domain case. http://www.jstatsoft.org/index.php/jss/article/view/2192 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Giulio Barcaroli |
spellingShingle |
Giulio Barcaroli SamplingStrata: An R Package for the Optimization of Stratified Sampling Journal of Statistical Software |
author_facet |
Giulio Barcaroli |
author_sort |
Giulio Barcaroli |
title |
SamplingStrata: An R Package for the Optimization of Stratified Sampling |
title_short |
SamplingStrata: An R Package for the Optimization of Stratified Sampling |
title_full |
SamplingStrata: An R Package for the Optimization of Stratified Sampling |
title_fullStr |
SamplingStrata: An R Package for the Optimization of Stratified Sampling |
title_full_unstemmed |
SamplingStrata: An R Package for the Optimization of Stratified Sampling |
title_sort |
samplingstrata: an r package for the optimization of stratified sampling |
publisher |
Foundation for Open Access Statistics |
series |
Journal of Statistical Software |
issn |
1548-7660 |
publishDate |
2014-11-01 |
description |
When designing a sampling survey, constraints are usually set on the desired precision levels regarding one or more target estimates (the Ys). If a sampling frame is available, containing auxiliary information related to each unit (the Xs), it is possible to adopt a stratified sample design. For any given stratification of the frame, in the multivariate case it is possible to solve the problem of the best allocation of units in strata, by minimizing a cost function subject to precision constraints (or, conversely, by maximizing the precision of the estimates under a given budget). The problem is to determine the best stratification in the frame, i.e., the one that ensures the overall minimal cost of the sample necessary to satisfy precision constraints. The Xs can be categorical or continuous; continuous ones can be transformed into categorical ones. The most detailed stratification is given by the Cartesian product of the Xs (the atomic strata). A way to determine the best stratification is to explore exhaustively the set of all possible partitions derivable from the set of atomic strata, evaluating each one by calculating the corresponding cost in terms of the sample required to satisfy precision constraints. This is unaffordable in practical situations, where the dimension of the space of the partitions can be very high. Another possible way is to explore the space of partitions with an algorithm that is particularly suitable in such situations: the genetic algorithm. The R package SamplingStrata, based on the use of a genetic algorithm, makes it possible to determine the best stratification for a population frame, i.e., the one that ensures the minimum sample cost necessary to satisfy precision constraints, in a multivariate and multi-domain case. |
url |
http://www.jstatsoft.org/index.php/jss/article/view/2192 |
work_keys_str_mv |
AT giuliobarcaroli samplingstrataanrpackagefortheoptimizationofstratiedsampling |
_version_ |
1725461824909869056 |
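
Note on the description: the claim that exhaustive exploration is unaffordable can be made concrete with a small count. The sketch below is illustrative only and is not taken from the SamplingStrata package (which is written in R). It is written in Python, and the auxiliary variables `region`, `size_class` and `sector` and their levels are hypothetical. It builds the atomic strata as the Cartesian product of the Xs and counts the partitions of that set using Bell numbers, the standard count of set partitions.

```python
from itertools import product


def bell_number(n: int) -> int:
    """Return the number of partitions of a set with n elements (Bell triangle)."""
    row = [1]  # row 0 of the Bell triangle; its first entry is B_0
    for _ in range(n):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]  # first entry of row n is B_n


# Hypothetical categorical auxiliary variables (the Xs) and their levels.
x_levels = {
    "region": ["north", "centre", "south"],
    "size_class": ["small", "large"],
    "sector": ["industry", "services"],
}

# Atomic strata: one stratum per combination of levels of the Xs.
atomic_strata = list(product(*x_levels.values()))

# Every partition of the atomic strata is a candidate stratification.
n_candidates = bell_number(len(atomic_strata))

print(len(atomic_strata), "atomic strata ->", n_candidates, "candidate stratifications")
```

With only 12 atomic strata there are already 4,213,597 candidate stratifications, each of which would need its own allocation step to evaluate the sample cost. This growth is the reason the description gives for using a heuristic search such as a genetic algorithm instead of enumeration.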