Data famine in big data era : machine learning algorithms for visual object recognition with limited training data


Bibliographic Details
Main Author: Guo, Zhenyu
Language: English
Published: University of British Columbia 2014
Online Access: http://hdl.handle.net/2429/46412
Description
Summary: Big data is an increasingly attractive concept in many fields, both in academia and in industry. The growing amount of information creates an illusion that we will have enough data to solve all data-driven problems. Unfortunately, this is not true, especially in areas where machine learning methods are heavily employed: sufficient high-quality training data does not necessarily come with big data, and it is difficult or sometimes impossible to collect the sufficient training samples on which most computational algorithms depend. This thesis focuses on situations with limited training data in visual object recognition, developing novel machine learning algorithms to overcome that difficulty. We investigate three issues in object recognition involving limited training data: 1. one-shot object recognition, 2. cross-domain object recognition, and 3. object recognition for images with different picture styles. For Issue 1, we propose an unsupervised feature learning algorithm by constructing a deep structure of the stacked Hierarchical Dirichlet Process (HDP) auto-encoder, in order to extract "semantic" information from unlabeled source images. For Issue 2, we propose a Domain Adaptive Input-Output Kernel Learning algorithm to reduce the domain shifts in both input and output spaces. For Issue 3, we introduce a new problem involving images with different picture styles, formulate the relationship between pixel mapping functions and gradient-based image descriptors, and propose a multiple-kernel-based algorithm that learns an optimal combination of basis pixel mapping functions to improve recognition accuracy. For all the proposed algorithms, experimental results on publicly available data sets demonstrate performance improvements over the previous state of the art.