Influence Analysis and Alternative Approaches for Principal Component Analysis in Symbolic Interval Data
Master's === National Chung Cheng University === Graduate Institute of Statistical Science, Department of Mathematics === 103 === Principal component analysis (PCA) is a well-known statistical procedure for dimension reduction in classical data. As the age of big data advances, classical data may be aggregated as symbolic data, which was introduced by Billard and Diday (1987). In literat...
Main Authors: | Chih-Sheng Lin, 林志昇 |
---|---|
Other Authors: | Yu-fen Huang |
Format: | Others |
Language: | en_US |
Published: | 2015 |
Online Access: | http://ndltd.ncl.edu.tw/handle/z6cw7t |
id |
ndltd-TW-103CCU00477014 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-103CCU00477014 2019-05-15T22:07:28Z http://ndltd.ncl.edu.tw/handle/z6cw7t Influence Analysis and Alternative Approaches for Principal Component Analysis in Symbolic Interval Data Chih-Sheng Lin 林志昇 Master's National Chung Cheng University Graduate Institute of Statistical Science, Department of Mathematics 103 Yu-fen Huang 黃郁芬 2015 thesis 62 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Chung Cheng University === Graduate Institute of Statistical Science, Department of Mathematics === 103 === Principal component analysis (PCA) is a well-known statistical procedure for dimension reduction in classical data. As the age of big data advances, classical data may be aggregated into symbolic data, which was introduced by Billard and Diday (1987). In the literature, Cazes et al. (1997) and Chouakria et al. (1998) proposed vertices PCA and centers PCA, Le-Rademacher and Billard (2012) proposed symbolic covariance PCA, and Ichino (2011) proposed quantile PCA for symbolic interval-valued data. In this thesis, we first investigate the performance of these four PCA approaches. However, suspicious observations can greatly influence the results of a PCA, so detecting such influential intervals becomes an indispensable task. To our knowledge, influence analysis for PCA on symbolic interval-valued data has not been explored in the literature, and it is therefore the emphasis of this thesis. Hampel (1974) proposed the influence function, which provides a useful tool for diagnosing influential points. We adopt Hampel's technique and develop three types of influence functions for the eigenvalues and eigenvectors of symbolic interval-valued data, namely the empirical influence function, the deleted empirical influence function, and the sample influence function. We illustrate the proposed methods with simulation studies and real data examples.
|
author2 |
Yu-fen Huang |
author_facet |
Yu-fen Huang Chih-Sheng Lin 林志昇 |
author |
Chih-Sheng Lin 林志昇 |
spellingShingle |
Chih-Sheng Lin 林志昇 Influence Analysis and Alternative Approaches for Principal Component Analysis in Symbolic Interval Data |
author_sort |
Chih-Sheng Lin |
title |
Influence Analysis and Alternative Approaches for Principal Component Analysis in Symbolic Interval Data |
publishDate |
2015 |
url |
http://ndltd.ncl.edu.tw/handle/z6cw7t |
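The abstract names centers PCA and a sample influence function for eigenvalues without giving formulas. As a rough illustration only, the sketch below assumes the classical leave-one-out definition, SIF_i = (n - 1)(λ̂ - λ̂_(i)), applied to the covariance matrix of interval midpoints. The function names, the simulated data, and the choice of centers PCA are assumptions made for this example and are not taken from the thesis, whose definitions for interval-valued data may differ.

```python
import numpy as np


def centers_pca_eigenvalues(lower, upper):
    """Eigenvalues (descending) of the covariance matrix of interval midpoints.

    Assumption: this is a plain "centers" style PCA; it ignores interval widths.
    """
    centers = (np.asarray(lower, dtype=float) + np.asarray(upper, dtype=float)) / 2.0
    cov = np.cov(centers, rowvar=False)
    # eigvalsh returns ascending eigenvalues for a symmetric matrix; reverse them.
    return np.linalg.eigvalsh(cov)[::-1]


def sample_influence_eigenvalues(lower, upper):
    """Leave-one-out sample influence of each interval on every eigenvalue.

    Row i holds (n - 1) * (full-sample eigenvalues - eigenvalues without row i).
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    n = lower.shape[0]
    full = centers_pca_eigenvalues(lower, upper)
    sif = np.empty((n, full.size))
    for i in range(n):
        keep = np.arange(n) != i  # boolean mask dropping observation i
        sif[i] = (n - 1) * (full - centers_pca_eigenvalues(lower[keep], upper[keep]))
    return sif


# Hypothetical simulated intervals: 30 observations on 3 variables.
rng = np.random.default_rng(0)
mid = rng.normal(size=(30, 3))
half_width = rng.uniform(0.1, 0.5, size=(30, 3))
lower, upper = mid - half_width, mid + half_width

# Shift the first interval far from the rest so that it acts as an outlier.
lower[0] += 8.0
upper[0] += 8.0

sif = sample_influence_eigenvalues(lower, upper)
most_influential = int(np.argmax(np.abs(sif[:, 0])))
print("Interval with the largest influence on the first eigenvalue:", most_influential)
```

A large absolute value in a row of `sif` flags an interval whose removal changes that eigenvalue substantially. An analogous loop over eigenvectors would need a sign convention, since eigenvectors are only defined up to sign.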