Graph-based Latent Embedding, Annotation and Representation Learning in Neural Networks for Semi-supervised and Unsupervised Settings

Machine learning has been immensely successful in supervised learning with outstanding examples in major industrial applications such as voice and image recognition. Following these developments, the most recent research has now begun to focus primarily on algorithms which can exploit very...

Full description

Bibliographic Details
Main Author: Kilinc, Ismail Ozsel
Format: Others
Published: Scholar Commons 2017
Subjects:
Online Access:https://scholarcommons.usf.edu/etd/7415
https://scholarcommons.usf.edu/cgi/viewcontent.cgi?article=8612&context=etd
Description
Summary:Machine learning has been immensely successful in supervised learning with outstanding examples in major industrial applications such as voice and image recognition. Following these developments, the most recent research has now begun to focus primarily on algorithms which can exploit very large sets of unlabeled examples to reduce the amount of manually labeled data required for existing models to perform well. In this dissertation, we propose graph-based latent embedding/annotation/representation learning techniques in neural networks tailored for semi-supervised and unsupervised learning problems. Specifically, we propose a novel regularization technique called Graph-based Activity Regularization (GAR) and a novel output layer modification called Auto-clustering Output Layer (ACOL) which can be used separately or collaboratively to develop scalable and efficient learning frameworks for semi-supervised and unsupervised settings. First, singularly using the GAR technique, we develop a framework providing an effective and scalable graph-based solution for semi-supervised settings in which there exists a large number of observations but a small subset with ground-truth labels. The proposed approach is natural for the classification framework on neural networks as it requires no additional task calculating the reconstruction error (as in autoencoder based methods) or implementing zero-sum game mechanism (as in adversarial training based methods). We demonstrate that GAR effectively and accurately propagates the available labels to unlabeled examples. Our results show comparable performance with state-of-the-art generative approaches for this setting using an easier-to-train framework. Second, we explore a different type of semi-supervised setting where a coarse level of labeling is available for all the observations but the model has to learn a fine, deeper level of latent annotations for each one. Problems in this setting are likely to be encountered in many domains such as text categorization, protein function prediction, image classification as well as in exploratory scientific studies such as medical and genomics research. We consider this setting as simultaneously performed supervised classification (per the available coarse labels) and unsupervised clustering (within each one of the coarse labels) and propose a novel framework combining GAR with ACOL, which enables the network to perform concurrent classification and clustering. We demonstrate how the coarse label supervision impacts performance and the classification task actually helps propagate useful clustering information between sub-classes. Comparative tests on the most popular image datasets rigorously demonstrate the effectiveness and competitiveness of the proposed approach. The third and final setup builds on the prior framework to unlock fully unsupervised learning where we propose to substitute real, yet unavailable, parent- class information with pseudo class labels. In this novel unsupervised clustering approach the network can exploit hidden information indirectly introduced through a pseudo classification objective. We train an ACOL network through this pseudo supervision together with unsupervised objective based on GAR and ultimately obtain a k-means friendly latent representation. Furthermore, we demonstrate how the chosen transformation type impacts performance and helps propagate the latent information that is useful in revealing unknown clusters. Our results show state-of-the-art performance for unsupervised clustering tasks on MNIST, SVHN and USPS datasets with the highest accuracies reported to date in the literature.