Integration and imputation of survey data in R: the StatMatch package

Bibliographic Details
Main Author: Marcello D’Orazio
Format: Article
Language: English
Published: Romanian National Institute of Statistics 2015-06-01
Series: Revista Română de Statistică
Online Access: http://www.revistadestatistica.ro/wp-content/uploads/2015/04/RRS2_2015_A06.pdf
Description
Summary: Statistical matching methods make it possible to integrate two or more data sources in order to investigate the relationship between variables that are not jointly observed. These methods have recently received much attention as a valid alternative for producing new statistical outputs. The paper provides an overview of the statistical matching methods implemented in the StatMatch package for the R environment, focusing on the most widespread methods and on how they have been improved. Particular attention is devoted to hot deck matching methods, which are closely related to those developed for the imputation of missing values. The corresponding functions in StatMatch are flexible enough to be applied to imputing missing values in a survey. The paper also tackles the problem of matching data from complex sample surveys, a very important topic for National Statistical Institutes. Finally, it describes the concept of uncertainty that characterizes the statistical matching framework and how this alternative approach can be exploited for different purposes.
ISSN: 1018-046X, 1844-7694