Global Clustering Quality Coefficient Assessing the Efficiency of PCA Class Assignment

An essential factor influencing the efficiency of the predictive models built with principal component analysis (PCA) is the quality of the data clustering revealed by the score plots. The sensitivity and selectivity of the class assignment are strongly influenced by the relative position of the clu...

Bibliographic Details
Main Authors: Mirela Praisler, Stefanut Ciochina
Format: Article
Language: English
Published: Hindawi Limited 2014-01-01
Series: Journal of Analytical Methods in Chemistry
Online Access: http://dx.doi.org/10.1155/2014/342497
id doaj-004cb3e4c6c548e09d41393db4914fd0
record_format Article
spelling doaj-004cb3e4c6c548e09d41393db4914fd0 | 2020-11-25T02:07:14Z | eng | Hindawi Limited | Journal of Analytical Methods in Chemistry | 2090-8865 | 2090-8873 | 2014-01-01 | 2014 | 10.1155/2014/342497 | 342497 | Global Clustering Quality Coefficient Assessing the Efficiency of PCA Class Assignment | Mirela Praisler 0 | Stefanut Ciochina 1 | Department of Chemistry, Physics and Environment, “Dunarea de Jos” University of Galati, 800008 Galati, Romania | Department of Chemistry, Physics and Environment, “Dunarea de Jos” University of Galati, 800008 Galati, Romania | An essential factor influencing the efficiency of the predictive models built with principal component analysis (PCA) is the quality of the data clustering revealed by the score plots. The sensitivity and selectivity of the class assignment are strongly influenced by the relative position of the clusters and by their dispersion. We are proposing a set of indicators inspired from analytical geometry that may be used for an objective quantitative assessment of the data clustering quality as well as a global clustering quality coefficient (GCQC) that is a measure of the overall predictive power of the PCA models. The use of these indicators for evaluating the efficiency of the PCA class assignment is illustrated by a comparative study performed for the identification of the preprocessing function that is generating the most efficient PCA system screening for amphetamines based on their GC-FTIR spectra. The GCQC ranking of the tested feature weights is explained based on estimated density distributions and validated by using quadratic discriminant analysis (QDA). | http://dx.doi.org/10.1155/2014/342497
collection DOAJ
language English
format Article
sources DOAJ
author Mirela Praisler
Stefanut Ciochina
title Global Clustering Quality Coefficient Assessing the Efficiency of PCA Class Assignment
publisher Hindawi Limited
series Journal of Analytical Methods in Chemistry
issn 2090-8865
2090-8873
publishDate 2014-01-01
description An essential factor influencing the efficiency of the predictive models built with principal component analysis (PCA) is the quality of the data clustering revealed by the score plots. The sensitivity and selectivity of the class assignment are strongly influenced by the relative position of the clusters and by their dispersion. We are proposing a set of indicators inspired from analytical geometry that may be used for an objective quantitative assessment of the data clustering quality as well as a global clustering quality coefficient (GCQC) that is a measure of the overall predictive power of the PCA models. The use of these indicators for evaluating the efficiency of the PCA class assignment is illustrated by a comparative study performed for the identification of the preprocessing function that is generating the most efficient PCA system screening for amphetamines based on their GC-FTIR spectra. The GCQC ranking of the tested feature weights is explained based on estimated density distributions and validated by using quadratic discriminant analysis (QDA).
url http://dx.doi.org/10.1155/2014/342497