Sparse Generalized PCA and Dependency Learning for Large-Scale Applications Beyond Gaussianity

Bibliographic Details
Other Authors: Zhang, Qiaoya (author)
Format: Others
Language: English
Published: Florida State University
Subjects:
Online Access: http://purl.flvc.org/fsu/fd/FSU_2016SP_Zhang_fsu_0071E_13087
Description
Summary: The age of big data has renewed interest in dimension reduction, yet coping with high-dimensional data remains a difficult problem in statistical learning. In this study, we consider the task of dimension reduction: projecting data into a lower-rank subspace while preserving maximal information. We investigate the pitfalls of classical PCA and propose a set of algorithms that function in high dimensions, extend to all exponential family distributions, perform feature selection at the same time, and take missing values into account. Based upon the best-performing one, we develop the SG-PCA algorithm. With acceleration techniques and a progressive screening scheme, it demonstrates superior scalability and accuracy compared to existing methods. Concerned with the independence assumption of dimension reduction techniques, we propose a novel framework, Generalized Indirect Dependency Learning (GIDL), to learn and incorporate association structure in multivariate statistical analysis. Without constraints on the particular distribution of the data, GIDL takes any pre-specified smooth loss function and is able to both extract the association structure and infuse it into the regression, classification, or dimension reduction problem. Experiments at the end demonstrate its efficacy. === A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. === Spring Semester 2016. === March 29, 2016. === Includes bibliographical references. === Yiyuan She, Professor Directing Dissertation; Teng Ma, University Representative; Xufeng Niu, Committee Member; Debajyoti Sinha, Committee Member; Elizabeth Slate, Committee Member.