Regularized Discriminant Analysis: A Large Dimensional Study

In this thesis, we study the performance of general regularized discriminant analysis (RDA) classifiers. The data are assumed to follow a Gaussian mixture model with different means and covariances. RDA offers a rich class of regularization options, covering as special cases the regularized linear discriminant analysis (RLDA) and the regularized quadratic discriminant analysis (RQDA) classifiers. We analyze RDA under the double asymptotic regime in which the data dimension and the training size grow proportionally. This double asymptotic regime allows the application of fundamental results from random matrix theory. Under this regime and some mild assumptions, we show that the classification error converges to a deterministic quantity that depends only on the data statistical parameters and dimensions. This result not only reveals mathematical relations between the misclassification error and the class statistics, but can also be leveraged to select the optimal parameters that minimize the classification error, thus yielding the optimal classifier. Validation on synthetic data shows good accuracy of our theoretical findings. We also construct a general consistent estimator to approximate the true classification error when the underlying statistics are unknown. We benchmark the performance of our proposed consistent estimator against the classical estimator on synthetic data. The results demonstrate that the general estimator outperforms the others in terms of mean squared error (MSE).
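The RDA family described above can be illustrated with a minimal sketch: a Gaussian discriminant rule whose per-class covariance estimates are blended toward the pooled covariance and shrunk toward the identity. The parameter names (`gamma`, `alpha`) and the exact blending scheme are illustrative assumptions, not the thesis's notation; `gamma=1` recovers an RLDA-like rule (shared covariance) and `gamma=0` an RQDA-like rule (per-class covariances).

```python
import numpy as np

class RDA:
    """Sketch of a regularized discriminant analysis classifier.

    gamma blends each class covariance toward the pooled covariance
    (gamma=1 ~ RLDA, gamma=0 ~ RQDA); alpha shrinks toward the identity,
    which keeps the estimate invertible when the dimension p is comparable
    to the training size n. Hypothetical parameterization for illustration.
    """

    def __init__(self, gamma=0.5, alpha=0.1):
        self.gamma, self.alpha = gamma, alpha

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        n, p = X.shape
        self.means_, self.priors_ = [], []
        per_class = []
        pooled = np.zeros((p, p))
        for c in self.classes_:
            Xc = X[y == c]
            self.means_.append(Xc.mean(axis=0))
            self.priors_.append(len(Xc) / n)
            S = np.cov(Xc, rowvar=False)          # per-class sample covariance
            per_class.append(S)
            pooled += (len(Xc) - 1) * S
        pooled /= n - len(self.classes_)          # standard pooled estimate
        self.covs_ = []
        for S in per_class:
            Sg = self.gamma * pooled + (1 - self.gamma) * S       # blend
            self.covs_.append((1 - self.alpha) * Sg + self.alpha * np.eye(p))
        return self

    def predict(self, X):
        # Gaussian discriminant score per class with regularized covariances.
        scores = []
        for mu, S, pi in zip(self.means_, self.covs_, self.priors_):
            d = X - mu
            Sinv = np.linalg.inv(S)
            _, logdet = np.linalg.slogdet(S)
            quad = np.einsum('ij,jk,ik->i', d, Sinv, d)
            scores.append(-0.5 * quad - 0.5 * logdet + np.log(pi))
        return self.classes_[np.argmax(scores, axis=0)]
```

In the large-dimensional regime studied in the thesis, the point of the asymptotic analysis is precisely that the error of such a rule, as a function of the regularization parameters, concentrates around a deterministic limit, so the parameters can be tuned from that limit rather than by cross-validation.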

Bibliographic Details
Main Author: Yang, Xiaoke
Other Authors: Al-Naffouri, Tareq Y.
Language:en
Published: 2018
Subjects:
Online Access:http://hdl.handle.net/10754/627734
id ndltd-kaust.edu.sa-oai-repository.kaust.edu.sa-10754-627734
record_format oai_dc
spelling ndltd-kaust.edu.sa-oai-repository.kaust.edu.sa-10754-6277342019-09-18T03:08:19Z Regularized Discriminant Analysis: A Large Dimensional Study Yang, Xiaoke Al-Naffouri, Tareq Y. Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division Alouini, Mohamed-Slim Zhang, Xiangliang Random matrix theory Machine Learning Discriminant analysis In this thesis, we study the performance of general regularized discriminant analysis (RDA) classifiers. The data are assumed to follow a Gaussian mixture model with different means and covariances. RDA offers a rich class of regularization options, covering as special cases the regularized linear discriminant analysis (RLDA) and the regularized quadratic discriminant analysis (RQDA) classifiers. We analyze RDA under the double asymptotic regime in which the data dimension and the training size grow proportionally. This double asymptotic regime allows the application of fundamental results from random matrix theory. Under this regime and some mild assumptions, we show that the classification error converges to a deterministic quantity that depends only on the data statistical parameters and dimensions. This result not only reveals mathematical relations between the misclassification error and the class statistics, but can also be leveraged to select the optimal parameters that minimize the classification error, thus yielding the optimal classifier. Validation on synthetic data shows good accuracy of our theoretical findings. We also construct a general consistent estimator to approximate the true classification error when the underlying statistics are unknown. We benchmark the performance of our proposed consistent estimator against the classical estimator on synthetic data. The results demonstrate that the general estimator outperforms the others in terms of mean squared error (MSE). 
2018-05-02T05:17:56Z 2018-05-02T05:17:56Z 2018-04-28 Thesis 10.25781/KAUST-Z11W4 http://hdl.handle.net/10754/627734 en
collection NDLTD
language en
sources NDLTD
topic Random matrix theory
Machine Learning
Discriminant analysis
spellingShingle Random matrix theory
Machine Learning
Discriminant analysis
Yang, Xiaoke
Regularized Discriminant Analysis: A Large Dimensional Study
description In this thesis, we study the performance of general regularized discriminant analysis (RDA) classifiers. The data are assumed to follow a Gaussian mixture model with different means and covariances. RDA offers a rich class of regularization options, covering as special cases the regularized linear discriminant analysis (RLDA) and the regularized quadratic discriminant analysis (RQDA) classifiers. We analyze RDA under the double asymptotic regime in which the data dimension and the training size grow proportionally. This double asymptotic regime allows the application of fundamental results from random matrix theory. Under this regime and some mild assumptions, we show that the classification error converges to a deterministic quantity that depends only on the data statistical parameters and dimensions. This result not only reveals mathematical relations between the misclassification error and the class statistics, but can also be leveraged to select the optimal parameters that minimize the classification error, thus yielding the optimal classifier. Validation on synthetic data shows good accuracy of our theoretical findings. We also construct a general consistent estimator to approximate the true classification error when the underlying statistics are unknown. We benchmark the performance of our proposed consistent estimator against the classical estimator on synthetic data. The results demonstrate that the general estimator outperforms the others in terms of mean squared error (MSE).
author2 Al-Naffouri, Tareq Y.
author_facet Al-Naffouri, Tareq Y.
Yang, Xiaoke
author Yang, Xiaoke
author_sort Yang, Xiaoke
title Regularized Discriminant Analysis: A Large Dimensional Study
title_short Regularized Discriminant Analysis: A Large Dimensional Study
title_full Regularized Discriminant Analysis: A Large Dimensional Study
title_fullStr Regularized Discriminant Analysis: A Large Dimensional Study
title_full_unstemmed Regularized Discriminant Analysis: A Large Dimensional Study
title_sort regularized discriminant analysis: a large dimensional study
publishDate 2018
url http://hdl.handle.net/10754/627734
work_keys_str_mv AT yangxiaoke regularizeddiscriminantanalysisalargedimensionalstudy
_version_ 1719251689546973184