Sparse Factor Analysis and Its Applications


Bibliographic Details
Main Authors: Po-Yu Huang, 黃博煜
Other Authors: 許英麟
Format: Others
Language: en_US
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/13395304912843724581
Description
Summary: Doctoral === Chung Hsing University (中興大學) === Department of Applied Mathematics === 103 === Modern data sets, such as genomic and image data, often contain huge amounts of information. A critical challenge in analyzing high-dimensional data is how to reduce the dimension of the data and extract relevant features. We therefore propose a simultaneously sparse factor analysis (SSFA) approach that tackles these problems by employing an L1 penalty function to promote sparseness in the factor loadings. For clustering applications, we provide two approaches based on SSFA: (1) a cutoff-split approach that excludes some non-separable patients by imposing another L1 penalty function on the factor scores (cutoff-based clustering); (2) a mixture of SSFA models (mixture-based clustering). Simulation results show that SSFA yields smaller bias and variance than other sparse approaches, as well as a lower classification error rate than other mixture models with dimension reduction. Application to a published gene signature in two unique lung cancer datasets demonstrates the utility of both SSFA-based clustering approaches in refining the gene signature and improving classification to better predict risk of cancer death and treatment benefit.
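
As a rough illustration of how an L1 penalty can promote sparseness in factor loadings, the sketch below applies soft-thresholding (the proximal operator of the L1 penalty) to a made-up loading vector. It is not the SSFA algorithm from the thesis, whose details are not given in this record; the function name `soft_threshold`, the penalty weight `lam`, and the loading values are all assumptions chosen for illustration.

```python
def soft_threshold(values, lam):
    """Elementwise proximal operator of lam * |x| (hypothetical helper).

    Each entry is shrunk toward zero by lam; entries whose absolute
    value is at most lam are set exactly to zero.
    """
    result = []
    for v in values:
        magnitude = abs(v) - lam
        if magnitude <= 0:
            result.append(0.0)
        else:
            result.append(magnitude if v > 0 else -magnitude)
    return result


# Illustrative loadings for one factor (invented values, not thesis data).
loadings = [0.9, -0.05, 0.3, 0.02, -0.7]

# lam = 0.1 is an arbitrary penalty weight picked for this example.
sparse_loadings = soft_threshold(loadings, lam=0.1)

# Count how many loadings were driven exactly to zero.
n_zero = sum(1 for v in sparse_loadings if v == 0.0)
print(sparse_loadings, n_zero)
```

In this toy case the two loadings smaller in magnitude than `lam` become exactly zero, while the larger ones are shrunk but keep their sign. That is the sense in which an L1 penalty yields sparse, more interpretable loadings.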