Missing data management and statistical measurement of socio-economic status: application of big data

Abstract Socio-economic status measurement is an ongoing problem where different suggested measurements are given by researchers. This work investigates a socio-economic status measurement derived from natural correlations of variables which can better and meaningfully cluster African countries for...

Full description

Bibliographic Details
Main Author: Habtamu Tilaye Wubetie
Format: Article
Language:English
Published: SpringerOpen 2017-12-01
Series:Journal of Big Data
Subjects:
Online Access:http://link.springer.com/article/10.1186/s40537-017-0099-y
Description
Summary:Abstract Socio-economic status measurement is an ongoing problem where different suggested measurements are given by researchers. This work investigates a socio-economic status measurement derived from natural correlations of variables which can better and meaningfully cluster African countries for the level of status. The researcher used 48 African countries socio-economic yearly time series data from 1993 to 2013 of IMF 2013 data set for data management (i.e, 2737 variables for 21 years), however, the analysis is reasonably done based on recent 14 years time series data. In data management, missing values are treated (imputed) by using regression estimates, Lagrange interpolation, linear interpolation and linear spline interpolation based on the appropriate method which best fits for the trend of data with minimum error at each time level. From principal component and factor analysis of average time series data, 7 principal factors contributed by 84 variables which explain $$70\%$$ 70 % of the variation in the data set are suggested as a socio-economic status measuring components and as a result the considered clustering methods (K-mean Method, Average linkage method, Ward’s method and Bootstrap Ward’s method) are agreed on six clusters of countries, those are statistically significant at $$95\%$$ 95 % , where as three countries each where suggested as outlier-countries made an individual cluster.
ISSN:2196-1115