Permutation tests to estimate significances on Principal Components Analysis

Principal Component Analysis is the most widely used multivariate technique to summarize information in a data collection with many variables. However, for it to be valid and useful, the meaningful information must be retained and the noisy information must be sorted out. To achieve this, an index from...
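As a rough illustration of the permutation approach named in the title, the sketch below tests each principal component's eigenvalue against a null distribution obtained by shuffling every variable independently, which destroys the correlations between variables while preserving each variable's own distribution. It is a minimal sketch under stated assumptions, not the article's method: the index used here (the eigenvalue of the correlation matrix) is a common choice rather than one of the new indices the article proposes, and the function names, `n_perm`, and `seed` are hypothetical.

```python
import numpy as np


def pca_eigenvalues(data):
    """Eigenvalues of the correlation matrix of `data`, largest first."""
    corr = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]


def permutation_test_pca(data, n_perm=999, seed=0):
    """Permutation p-value for each eigenvalue (illustrative index only)."""
    rng = np.random.default_rng(seed)
    observed = pca_eigenvalues(data)
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        # Shuffle each column on its own so that any association
        # between variables is broken under the null hypothesis.
        shuffled = np.column_stack([rng.permutation(col) for col in data.T])
        exceed += pca_eigenvalues(shuffled) >= observed
    # Count the observed value as one member of the null distribution.
    p_values = (exceed + 1) / (n_perm + 1)
    return observed, p_values
```

Called on an observations-by-variables array, `permutation_test_pca` returns the observed eigenvalues and one p-value per component. No distributional assumption about the data is needed, which is the property the abstract attributes to permutation tests.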


Bibliographic Details
Main Author: Vasco M. N. C. S. Vieira
Format: Article
Language: English
Published: International Academy of Ecology and Environmental Sciences 2012-06-01
Series: Computational Ecology and Software
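Subjects: multivariate; permutation tests; principal components analysis; randomization; significance; stopping rules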
Online Access: http://www.iaees.org/publications/journals/ces/articles/2012-2(2)/permutation-tests-to-estimate-significances.pdf
id doaj-68a5a40dd8a14d8b8a988f2a0c95a5d6
record_format Article
spelling doaj-68a5a40dd8a14d8b8a988f2a0c95a5d6; 2020-11-24T22:57:30Z; eng; International Academy of Ecology and Environmental Sciences; Computational Ecology and Software; 2220-721X; 2012-06-01; 2; 2; 103; 123; Permutation tests to estimate significances on Principal Components Analysis; Vasco M. N. C. S. Vieira; Principal Component Analysis is the most widely used multivariate technique to summarize information in a data collection with many variables. However, for it to be valid and useful, the meaningful information must be retained and the noisy information must be sorted out. To achieve this, an index from the original data set is estimated, after which three classes of methodologies may be used: (i) the analytical solution to the distribution of the index under the assumption that the data have a multivariate normal distribution, (ii) the numerical solution to the distribution of the index by means of permutation tests, without any assumption about the data distribution, and (iii) the bootstrap numerical solution to the percentiles of the index and their comparison to its assumed value under the null hypothesis, also without any assumption about the data distribution. New indices are proposed for use with permutation tests and are compared with previous ones by application to several data sets. Their advantages and drawbacks are discussed, together with the adequacy of permutation tests and the inadequacy of both bootstrap techniques and methods that rely on the assumption of multivariate normal distributions.; http://www.iaees.org/publications/journals/ces/articles/2012-2(2)/permutation-tests-to-estimate-significances.pdf; multivariate; permutation tests; principal components analysis; randomization; significance; stopping rules
collection DOAJ
language English
format Article
sources DOAJ
author Vasco M. N. C. S. Vieira
spellingShingle Vasco M. N. C. S. Vieira
Permutation tests to estimate significances on Principal Components Analysis
Computational Ecology and Software
multivariate
permutation tests
principal components analysis
randomization
significance
stopping rules
author_facet Vasco M. N. C. S. Vieira
author_sort Vasco M. N. C. S. Vieira
title Permutation tests to estimate significances on Principal Components Analysis
title_sort permutation tests to estimate significances on principal components analysis
publisher International Academy of Ecology and Environmental Sciences
series Computational Ecology and Software
issn 2220-721X
publishDate 2012-06-01
description Principal Component Analysis is the most widely used multivariate technique to summarize information in a data collection with many variables. However, for it to be valid and useful, the meaningful information must be retained and the noisy information must be sorted out. To achieve this, an index from the original data set is estimated, after which three classes of methodologies may be used: (i) the analytical solution to the distribution of the index under the assumption that the data have a multivariate normal distribution, (ii) the numerical solution to the distribution of the index by means of permutation tests, without any assumption about the data distribution, and (iii) the bootstrap numerical solution to the percentiles of the index and their comparison to its assumed value under the null hypothesis, also without any assumption about the data distribution. New indices are proposed for use with permutation tests and are compared with previous ones by application to several data sets. Their advantages and drawbacks are discussed, together with the adequacy of permutation tests and the inadequacy of both bootstrap techniques and methods that rely on the assumption of multivariate normal distributions.
topic multivariate
permutation tests
principal components analysis
randomization
significance
stopping rules
url http://www.iaees.org/publications/journals/ces/articles/2012-2(2)/permutation-tests-to-estimate-significances.pdf