HOW TO REDUCE DIMENSIONALITY OF DATA: ROBUSTNESS POINT OF VIEW

Data analysis in management applications often requires handling data with a large number of variables. Dimensionality reduction is therefore a common and important step in the analysis of multivariate data by methods of both statistics and data mining. This paper gives an overview of r...

Bibliographic Details
Main Authors: Jan Kalina, Dita Rensová
Format: Article
Language: English
Published: University in Belgrade 2015-04-01
Series: Serbian Journal of Management
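Subjects: data analysis; dimensionality reduction; robust statistics; principal component analysis; robust classification analysis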
Online Access: http://www.sjm06.com/SJM%20ISSN1452-4864/10_1_2015_May_1-140/10_1_2015_131_140.pdf
id doaj-5ef516a85318478ba1d77e87f67f4ae7
record_format Article
spelling doaj-5ef516a85318478ba1d77e87f67f4ae7
2020-11-24T23:05:44Z
eng
University in Belgrade
Serbian Journal of Management, ISSN 1452-4864, 2217-7159
2015-04-01, volume 10, issue 1, pages 131-140
doi:10.5937/sjm10-6531
HOW TO REDUCE DIMENSIONALITY OF DATA: ROBUSTNESS POINT OF VIEW
Jan Kalina, Institute of Computer Science of the Academy of Sciences of the Czech Republic, Praha, Czech Republic
Dita Rensová, Institute of Computer Science of the Academy of Sciences of the Czech Republic, Praha, Czech Republic
collection DOAJ
language English
format Article
sources DOAJ
author Jan Kalina
Dita Rensová
title HOW TO REDUCE DIMENSIONALITY OF DATA: ROBUSTNESS POINT OF VIEW
publisher University in Belgrade
series Serbian Journal of Management
issn 1452-4864
2217-7159
publishDate 2015-04-01
description Data analysis in management applications often requires handling data with a large number of variables. Dimensionality reduction is therefore a common and important step in the analysis of multivariate data by methods of both statistics and data mining. This paper gives an overview of robust dimensionality reduction procedures, which are resistant to the presence of outlying measurements. The main contribution of the paper is a simulation study that compares various standard and robust dimensionality reduction procedures in combination with standard and robust methods of classification analysis. Standard methods turn out not to perform too badly on data that are only slightly contaminated by outliers; for highly contaminated data, we give practical recommendations on the choice of a suitable robust dimensionality reduction method. In particular, the highly robust principal component analysis based on the projection pursuit approach yields the most satisfactory results over four different simulation studies. We also give recommendations on the choice of a suitable robust classification method.
topic data analysis
dimensionality reduction
robust statistics
principal component analysis
robust classification analysis
url http://www.sjm06.com/SJM%20ISSN1452-4864/10_1_2015_May_1-140/10_1_2015_131_140.pdf