HOW TO REDUCE DIMENSIONALITY OF DATA: ROBUSTNESS POINT OF VIEW
Data analysis in management applications often requires handling data with a large number of variables. Therefore, dimensionality reduction is a common and important step in the analysis of multivariate data by methods of both statistics and data mining. This paper gives an overview of r...
Main Authors: | Jan Kalina, Dita Rensová |
---|---|
Format: | Article |
Language: | English |
Published: | University in Belgrade, 2015-04-01 |
Series: | Serbian Journal of Management |
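Subjects: | data analysis; dimensionality reduction; robust statistics; principal component analysis; robust classification analysis |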
Online Access: | http://www.sjm06.com/SJM%20ISSN1452-4864/10_1_2015_May_1-140/10_1_2015_131_140.pdf |
id |
doaj-5ef516a85318478ba1d77e87f67f4ae7 |
---|---|
record_format |
Article |
spelling |
doaj-5ef516a85318478ba1d77e87f67f4ae7; 2020-11-24T23:05:44Z; eng; University in Belgrade; Serbian Journal of Management; ISSN 1452-4864, 2217-7159; 2015-04-01; vol. 10, no. 1, pp. 131-140; DOI 10.5937/sjm10-6531
HOW TO REDUCE DIMENSIONALITY OF DATA: ROBUSTNESS POINT OF VIEW
Jan Kalina (0), Dita Rensová (1)
(0) Institute of Computer Science of the Academy of Sciences of the Czech Republic, Praha, Czech Republic
(1) Institute of Computer Science of the Academy of Sciences of the Czech Republic, Praha, Czech Republic
Data analysis in management applications often requires handling data with a large number of variables. Therefore, dimensionality reduction is a common and important step in the analysis of multivariate data by methods of both statistics and data mining. This paper gives an overview of robust dimensionality reduction procedures, which are resistant to the presence of outlying measurements. A simulation study is the main contribution of the paper. It compares various standard and robust dimensionality reduction procedures in combination with standard and robust methods of classification analysis. While standard methods turn out not to perform too badly on data that are only slightly contaminated by outliers, we give practical recommendations concerning the choice of a suitable robust dimensionality reduction method for highly contaminated data. In particular, the highly robust principal component analysis based on the projection pursuit approach turns out to yield the most satisfactory results over four different simulation studies. At the same time, we give recommendations on the choice of a suitable robust classification method.
http://www.sjm06.com/SJM%20ISSN1452-4864/10_1_2015_May_1-140/10_1_2015_131_140.pdf
data analysis; dimensionality reduction; robust statistics; principal component analysis; robust classification analysis |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Jan Kalina, Dita Rensová |
spellingShingle |
Jan Kalina; Dita Rensová; HOW TO REDUCE DIMENSIONALITY OF DATA: ROBUSTNESS POINT OF VIEW; Serbian Journal of Management; data analysis; dimensionality reduction; robust statistics; principal component analysis; robust classification analysis |
author_facet |
Jan Kalina, Dita Rensová |
author_sort |
Jan Kalina |
title |
HOW TO REDUCE DIMENSIONALITY OF DATA: ROBUSTNESS POINT OF VIEW |
title_sort |
how to reduce dimensionality of data: robustness point of view |
publisher |
University in Belgrade |
series |
Serbian Journal of Management |
issn |
1452-4864 2217-7159 |
publishDate |
2015-04-01 |
description |
Data analysis in management applications often requires handling data with a large number of
variables. Therefore, dimensionality reduction is a common and important step in the
analysis of multivariate data by methods of both statistics and data mining. This paper gives an
overview of robust dimensionality reduction procedures, which are resistant to the presence of
outlying measurements. A simulation study is the main contribution of the paper. It compares
various standard and robust dimensionality reduction procedures in combination with standard and
robust methods of classification analysis. While standard methods turn out not to perform too badly
on data that are only slightly contaminated by outliers, we give practical recommendations
concerning the choice of a suitable robust dimensionality reduction method for highly contaminated
data. In particular, the highly robust principal component analysis based on the projection pursuit
approach turns out to yield the most satisfactory results over four different simulation studies.
At the same time, we give recommendations on the choice of a suitable robust classification method. |
topic |
data analysis; dimensionality reduction; robust statistics; principal component analysis; robust classification analysis |
url |
http://www.sjm06.com/SJM%20ISSN1452-4864/10_1_2015_May_1-140/10_1_2015_131_140.pdf |
work_keys_str_mv |
AT jankalina howtoreducedimensionalityofdatarobustnesspointofview AT ditarensova howtoreducedimensionalityofdatarobustnesspointofview |
_version_ |
1725625947791556608 |
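
The description above reports that principal component analysis based on projection pursuit copes best with highly contaminated data. The following is a minimal two-dimensional sketch of that general idea, not the authors' implementation or simulation design: the data, the contamination level, the use of the median absolute deviation (MAD) as the robust scale, and the function names `mad`, `classical_direction` and `robust_direction` are all assumptions made for illustration. Candidate directions are taken from the median-centered observations themselves, and the direction with the largest MAD of the projected data is kept.

```python
import math
import random
import statistics


def mad(values):
    """Median absolute deviation, scaled for consistency under normality."""
    med = statistics.median(values)
    return 1.4826 * statistics.median([abs(v - med) for v in values])


def classical_direction(points):
    """First principal direction of 2-D data from the sample covariance."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Angle of the major axis of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))


def robust_direction(points):
    """First direction found by projection pursuit with the MAD as scale."""
    cx = statistics.median([p[0] for p in points])
    cy = statistics.median([p[1] for p in points])
    centered = [(p[0] - cx, p[1] - cy) for p in points]
    best_dir, best_scale = None, -1.0
    for zx, zy in centered:
        norm = math.hypot(zx, zy)
        if norm < 1e-12:
            continue  # a point at the center defines no direction
        ax, ay = zx / norm, zy / norm
        scale = mad([ax * ux + ay * uy for ux, uy in centered])
        if scale > best_scale:
            best_dir, best_scale = (ax, ay), scale
    return best_dir


# Hypothetical data: the clean points vary mostly along the first axis,
# while about 9 % outliers sit far away along the second axis.
rng = random.Random(0)
clean = [(rng.gauss(0, 10), rng.gauss(0, 1)) for _ in range(500)]
outliers = [(rng.gauss(0, 1), rng.gauss(50, 1)) for _ in range(50)]
data = clean + outliers

classical = classical_direction(data)
robust = robust_direction(data)
print("classical first direction:", classical)
print("robust first direction:   ", robust)
```

In this setup the classical direction is pulled toward the outliers along the second axis, whereas the projection pursuit direction stays close to the axis along which the clean data actually vary. For more than two dimensions the same search would be repeated on the data projected onto the orthogonal complement of the directions already found.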