SamplingStrata: An R Package for the Optimization of Stratified Sampling

When designing a sampling survey, constraints are usually set on the desired precision levels of one or more target estimates (the Ys). If a sampling frame is available that contains auxiliary information on each unit (the Xs), a stratified sample design can be adopted. For any...


Bibliographic Details
Main Author: Giulio Barcaroli
Format: Article
Language: English
Published: Foundation for Open Access Statistics 2014-11-01
Series: Journal of Statistical Software
Online Access: http://www.jstatsoft.org/index.php/jss/article/view/2192
id doaj-05464d07800f42518dd6ee8d0e5b4a6e
record_format Article
spelling doaj-05464d07800f42518dd6ee8d0e5b4a6e 2020-11-24T23:55:34Z eng Foundation for Open Access Statistics Journal of Statistical Software 1548-7660 2014-11-01 611124 10.18637/jss.v061.i04 796 SamplingStrata: An R Package for the Optimization of Stratified Sampling Giulio Barcaroli http://www.jstatsoft.org/index.php/jss/article/view/2192
collection DOAJ
language English
format Article
sources DOAJ
author Giulio Barcaroli
title SamplingStrata: An R Package for the Optimization of Stratified Sampling
publisher Foundation for Open Access Statistics
series Journal of Statistical Software
issn 1548-7660
publishDate 2014-11-01
description When designing a sampling survey, constraints are usually set on the desired precision levels of one or more target estimates (the Ys). If a sampling frame is available that contains auxiliary information on each unit (the Xs), a stratified sample design can be adopted. For any given stratification of the frame, in the multivariate case it is possible to solve the problem of the best allocation of units to strata by minimizing a cost function subject to precision constraints (or, conversely, by maximizing the precision of the estimates under a given budget). The problem is then to determine the best stratification of the frame, i.e., the one that ensures the overall minimal cost of the sample necessary to satisfy the precision constraints. The Xs can be categorical or continuous; continuous ones can be transformed into categorical ones. The most detailed stratification is given by the Cartesian product of the Xs (the atomic strata). One way to determine the best stratification is to explore exhaustively the set of all possible partitions derivable from the set of atomic strata, evaluating each one by calculating the corresponding cost in terms of the sample required to satisfy the precision constraints. This is unaffordable in practical situations, where the dimension of the space of partitions can be very high. Another way is to explore the space of partitions with an algorithm that is particularly suitable in such situations: the genetic algorithm. The R package SamplingStrata, based on a genetic algorithm, makes it possible to determine the best stratification for a population frame, i.e., the one that ensures the minimum sample cost necessary to satisfy the precision constraints, in a multivariate and multi-domain case.
url http://www.jstatsoft.org/index.php/jss/article/view/2192
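
The description states that the atomic strata are the Cartesian product of the Xs and that exhaustively exploring every partition of them is unaffordable. The following Python sketch illustrates only that counting argument; it is not code from the SamplingStrata package, and the function names (atomic_strata, bell_number) and the example variables are hypothetical choices made for illustration. The number of partitions of a set of n elements is the Bell number B(n), computed here with the Bell triangle.

```python
from itertools import product


def atomic_strata(levels):
    """Return the atomic strata: one dict per combination of X categories.

    `levels` maps each (hypothetical) auxiliary variable name to its categories.
    """
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in product(*(levels[name] for name in names))]


def bell_number(n):
    """Number of ways to partition a set of n elements (Bell triangle)."""
    row = [1]  # row for B(0)
    for _ in range(n):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


# Illustrative frame with two categorical Xs (assumed values, not from the paper).
levels = {
    "region": ["north", "centre", "south"],
    "size_class": ["small", "medium", "large", "very_large"],
}
strata = atomic_strata(levels)
n_atomic = len(strata)             # 3 * 4 = 12 atomic strata
n_partitions = bell_number(n_atomic)  # 4213597 candidate stratifications
```

Even this small frame with 12 atomic strata admits more than four million candidate stratifications, each of which would require its own allocation problem to be solved, which is why a heuristic search such as a genetic algorithm is attractive here.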