Discovery of Latent Factors in High-dimensional Data Using Tensor Methods

Unsupervised learning aims at discovering the hidden structure that drives real-world observations. It is essential for success in modern machine learning and artificial intelligence. Latent variable models are versatile tools for unsupervised learning, with applications in almost every domain, e.g., social network analysis, natural language processing, computer vision, and computational biology. Training latent variable models is challenging due to the non-convexity of the likelihood objective function. An alternative method is based on the spectral decomposition of low-order moment matrices and tensors; this versatile framework is guaranteed to estimate the correct model consistently. My thesis spans both theoretical analysis of the tensor decomposition framework and practical implementations across various applications.

This thesis presents theoretical results on convergence of stochastic gradient descent to a globally optimal solution of tensor decomposition, despite the non-convexity of the objective. This is the first work to give global convergence guarantees for stochastic gradient descent on non-convex functions with exponentially many local minima and saddle points.

This thesis also presents large-scale deployments of spectral methods (matrix and tensor decomposition) on CPU, GPU, and Spark platforms. Dimensionality-reduction techniques such as random projection are incorporated to obtain a highly parallel and scalable tensor decomposition algorithm, yielding gains of several orders of magnitude in both accuracy and running time over state-of-the-art variational methods.

To solve real-world problems, more advanced models and learning algorithms are proposed. After introducing the tensor decomposition framework under the latent Dirichlet allocation (LDA) model, this thesis discusses generalizations of the LDA model: the mixed membership stochastic block model for learning hidden user commonalities or communities in social networks, the convolutional dictionary model for learning phrase templates and word-sequence embeddings, hierarchical tensor decomposition with a latent tree structure model for learning disease hierarchies in healthcare analytics, and a spatial point process mixture model for detecting cell types in neuroscience.
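The spectral approach the thesis builds on, recovering latent components from a low-order moment tensor, can be sketched with the tensor power method on a toy symmetric tensor. Everything below (sizes, weights, components) is synthetic and for illustration only; it is not the thesis's implementation, where the tensor would be estimated from data moments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy symmetric 3rd-order tensor T = sum_i w_i * a_i (x) a_i (x) a_i
# with orthonormal components a_i (hypothetical values, not from the thesis).
d, k = 8, 3
A, _ = np.linalg.qr(rng.standard_normal((d, k)))   # orthonormal columns a_i
w = np.array([3.0, 2.0, 1.0])                      # component weights
T = np.einsum('i,ai,bi,ci->abc', w, A, A, A)

def tensor_power_iteration(T, n_iter=100):
    """Recover one (weight, component) pair of an odd-order symmetric tensor."""
    v = rng.standard_normal(T.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        v = np.einsum('abc,b,c->a', T, v, v)       # the map v -> T(I, v, v)
        v /= np.linalg.norm(v)
    lam = np.einsum('abc,a,b,c->', T, v, v, v)     # eigenvalue T(v, v, v)
    return lam, v

# Deflation: peel off one recovered rank-1 component at a time.
weights = []
for _ in range(k):
    lam, v = tensor_power_iteration(T)
    weights.append(lam)
    T = T - lam * np.einsum('a,b,c->abc', v, v, v)

# sorted(weights) approximately recovers w; the order of recovery
# depends on the random initialization.
```

For orthogonally decomposable tensors, each run of the power iteration converges to one of the true components, and deflation removes it before the next run, so the full set of weights is recovered.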

Bibliographic Details
Main Author: Huang, Furong
Language: EN
Published: University of California, Irvine 2016
Subjects: Computer science
Online Access:http://pqdtopen.proquest.com/#viewpdf?dispub=10125323
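The random-projection step mentioned in the abstract, used there to make tensor decomposition parallel and scalable, can be sketched in isolation. The sizes below are invented for the example; the point is that a random Gaussian map to far fewer dimensions approximately preserves distances (the Johnson-Lindenstrauss property).

```python
import numpy as np

rng = np.random.default_rng(1)

# Project n points from d dimensions down to m << d with a random
# Gaussian matrix, scaled so norms are preserved in expectation.
# Toy sizes (hypothetical), not taken from the thesis.
n, d, m = 100, 1000, 200
X = rng.standard_normal((n, d))
R = rng.standard_normal((d, m)) / np.sqrt(m)
Y = X @ R

# Pairwise distances survive the projection up to small distortion.
orig = np.linalg.norm(X[0] - X[1])
proj = np.linalg.norm(Y[0] - Y[1])
ratio = proj / orig   # close to 1.0
```

Because the projected tensor is much smaller, the downstream decomposition becomes cheaper and easier to parallelize, which is the role this technique plays in the deployments the abstract describes.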
id ndltd-PROQUEST-oai-pqdtoai.proquest.com-10125323
record_format oai_dc
spelling ndltd-PROQUEST-oai-pqdtoai.proquest.com-101253232016-06-16T15:58:22Z Discovery of Latent Factors in High-dimensional Data Using Tensor Methods Huang, Furong Computer science University of California, Irvine 2016-06-15 00:00:00.0 thesis http://pqdtopen.proquest.com/#viewpdf?dispub=10125323 EN
collection NDLTD
language EN
sources NDLTD
topic Computer science
spellingShingle Computer science
Huang, Furong
Discovery of Latent Factors in High-dimensional Data Using Tensor Methods
description <p>Unsupervised learning aims at discovering the hidden structure that drives real-world observations. It is essential for success in modern machine learning and artificial intelligence. Latent variable models are versatile tools for unsupervised learning, with applications in almost every domain, e.g., social network analysis, natural language processing, computer vision, and computational biology. Training latent variable models is challenging due to the non-convexity of the likelihood objective function. An alternative method is based on the spectral decomposition of low-order moment matrices and tensors; this versatile framework is guaranteed to estimate the correct model consistently. My thesis spans both theoretical analysis of the tensor decomposition framework and practical implementations across various applications. </p><p> This thesis presents theoretical results on convergence of stochastic gradient descent to a globally optimal solution of tensor decomposition, despite the non-convexity of the objective. This is the first work to give global convergence guarantees for stochastic gradient descent on non-convex functions with exponentially many local minima and saddle points. </p><p> This thesis also presents large-scale deployments of spectral methods (matrix and tensor decomposition) on CPU, GPU, and Spark platforms. Dimensionality-reduction techniques such as random projection are incorporated to obtain a highly parallel and scalable tensor decomposition algorithm, yielding gains of several orders of magnitude in both accuracy and running time over state-of-the-art variational methods. </p><p> To solve real-world problems, more advanced models and learning algorithms are proposed. After introducing the tensor decomposition framework under the latent Dirichlet allocation (LDA) model, this thesis discusses generalizations of the LDA model: the mixed membership stochastic block model for learning hidden user commonalities or communities in social networks, the convolutional dictionary model for learning phrase templates and word-sequence embeddings, hierarchical tensor decomposition with a latent tree structure model for learning disease hierarchies in healthcare analytics, and a spatial point process mixture model for detecting cell types in neuroscience. </p>
author Huang, Furong
author_facet Huang, Furong
author_sort Huang, Furong
title Discovery of Latent Factors in High-dimensional Data Using Tensor Methods
title_short Discovery of Latent Factors in High-dimensional Data Using Tensor Methods
title_full Discovery of Latent Factors in High-dimensional Data Using Tensor Methods
title_fullStr Discovery of Latent Factors in High-dimensional Data Using Tensor Methods
title_full_unstemmed Discovery of Latent Factors in High-dimensional Data Using Tensor Methods
title_sort discovery of latent factors in high-dimensional data using tensor methods
publisher University of California, Irvine
publishDate 2016
url http://pqdtopen.proquest.com/#viewpdf?dispub=10125323
work_keys_str_mv AT huangfurong discoveryoflatentfactorsinhighdimensionaldatausingtensormethods
_version_ 1718306430622105600