Influence Analysis and Alternative Approaches for Principal Component Analysis in Symbolic Interval Data

Master's === National Chung Cheng University === Graduate Institute of Statistical Science, Department of Mathematics === 103 === Principal component analysis (PCA) is a well-known statistical procedure for dimension reduction in classical data. As the age of big data advances, classical data may be aggregated as symbolic data, which was introduced by Billard and Diday (1987). In literat...


Bibliographic Details
Main Authors: Chih-Sheng Lin, 林志昇
Other Authors: Yu-fen Huang
Format: Others
Language: en_US
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/z6cw7t
id ndltd-TW-103CCU00477014
record_format oai_dc
spelling ndltd-TW-103CCU00477014 2019-05-15T22:07:28Z http://ndltd.ncl.edu.tw/handle/z6cw7t Influence Analysis and Alternative Approaches for Principal Component Analysis in Symbolic Interval Data Chih-Sheng Lin 林志昇 Master's National Chung Cheng University Graduate Institute of Statistical Science, Department of Mathematics 103 Yu-fen Huang 黃郁芬 2015 degree thesis ; thesis 62 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Chung Cheng University === Graduate Institute of Statistical Science, Department of Mathematics === 103 === Principal component analysis (PCA) is a well-known statistical procedure for dimension reduction in classical data. As the age of big data advances, classical data may be aggregated as symbolic data, which was introduced by Billard and Diday (1987). In the literature, Cazes et al. (1997) and Chouakria et al. (1998) proposed vertices PCA and center PCA, Le-Rademacher and Billard (2012) proposed symbolic covariance PCA, and Ichino (2011) proposed quantile PCA for symbolic interval-valued data. In this thesis, we first investigate the performance of these four PCA approaches. However, suspicious observations can greatly influence the results of a PCA, so detecting such influential intervals is an indispensable task. To our knowledge, influence analysis for PCA on symbolic interval-valued data has not been explored in the literature, and it is therefore the focus of this thesis. Hampel (1974) proposed the influence function, which provides a useful tool for diagnosing influential points. We adopt Hampel's technique and develop three types of influence functions for the eigenvalues and eigenvectors of symbolic interval-valued data: the empirical influence function, the deleted empirical influence function, and the sample influence function. We illustrate the proposed methods with simulation studies and real data examples.
author2 Yu-fen Huang
author_facet Yu-fen Huang
Chih-Sheng Lin
林志昇
author Chih-Sheng Lin
林志昇
author_sort Chih-Sheng Lin
title Influence Analysis and Alternative Approaches for Principal Component Analysis in Symbolic Interval Data
publishDate 2015
url http://ndltd.ncl.edu.tw/handle/z6cw7t
work_keys_str_mv AT chihshenglin influenceanalysisandalternativeapproachesforprincipalcomponentanalysisinsymbolicintervaldata
AT línzhìshēng influenceanalysisandalternativeapproachesforprincipalcomponentanalysisinsymbolicintervaldata
_version_ 1719124008468742144
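
The abstract names center PCA and a sample influence function for eigenvalues. The following is a minimal Python sketch of both ideas, not the thesis's own formulation. It assumes that center PCA means an eigen-decomposition of the covariance matrix of the interval midpoints, and it uses the classical leave-one-out definition SIF_i = (n - 1) * (estimate - estimate without observation i). The function names center_pca and sample_influence_eigenvalues, and the simulated data with one deliberately outlying interval, are hypothetical and for illustration only.

```python
import numpy as np


def center_pca(lower, upper):
    """Center PCA sketch: eigen-decompose the covariance of interval midpoints.

    lower, upper: arrays of shape (n, p) holding interval endpoints.
    Returns eigenvalues (descending) and the matching eigenvectors (columns).
    """
    centers = (lower + upper) / 2.0
    cov = np.cov(centers, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


def sample_influence_eigenvalues(lower, upper):
    """Leave-one-out sample influence of each interval on every eigenvalue.

    Row i holds (n - 1) * (eigenvalues from all data
    - eigenvalues with observation i removed).
    """
    n = lower.shape[0]
    full_vals, _ = center_pca(lower, upper)
    sif = np.empty((n, full_vals.size))
    for i in range(n):
        keep = np.arange(n) != i
        vals_i, _ = center_pca(lower[keep], upper[keep])
        sif[i] = (n - 1) * (full_vals - vals_i)
    return sif


# Hypothetical data: 20 intervals in 3 variables, observation 0 made outlying.
rng = np.random.default_rng(0)
mid = rng.normal(size=(20, 3))
mid[0] = [10.0, 10.0, 10.0]
half_width = rng.uniform(0.1, 0.5, size=(20, 3))
lower, upper = mid - half_width, mid + half_width

sif = sample_influence_eigenvalues(lower, upper)
most_influential = int(np.argmax(np.abs(sif[:, 0])))
print("most influential interval for the first eigenvalue:", most_influential)
```

Because this sketch works only with midpoints, it ignores interval widths; the vertices, symbolic covariance, and quantile approaches mentioned in the abstract would each need a different covariance step. Eigenvector influence is omitted here because eigenvectors are only defined up to sign, which a leave-one-out comparison would have to resolve first.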