Multimodal Deep Learning for Multi-Label Classification and Ranking Problems

Bibliographic Details
Main Author:	Dubey, Abhishek
Other Authors:	Dukkipati, Ambedkar
Language:	en_US
Published:	2018
Subjects:	Neural Networks; Deep Neural Network Models; Neural Network Architecture; Multimodal Deep Neural Networks; Multimodal Deep Learning; Multi-Label Classification (MLC); Multi-class Classification (MCC); Label Ranking; Multimodal Neural Networks; Supervised Learning; Multilayer Neural Network; Perceptron Model; Computer Science
Online Access:	http://etd.iisc.ernet.in/2005/3681 http://etd.iisc.ernet.in/abstracts/4551/G26906-Abs.pdf

Description
Summary:	In recent years, deep neural network models have been shown to outperform many state-of-the-art algorithms. The reason is that unsupervised pretraining with multi-layered deep neural networks has been shown to learn better features, which in turn improves performance on many supervised tasks. These models not only automate the feature extraction process but also provide robust features for various machine learning tasks. However, unsupervised pretraining and feature extraction using multi-layered networks are restricted to the input features and are not applied to the output. The performance of many supervised learning algorithms (or models) depends on how well they handle output dependencies [Dembczyński et al., 2012]. Adapting standard neural networks to handle these output dependencies for a specific type of problem has been an active area of research [Zhang and Zhou, 2006, Ribeiro et al., 2012]. On the other hand, inference on multimodal data is considered a difficult problem in machine learning, and recently ‘deep multimodal neural networks’ have shown significant results [Ngiam et al., 2011, Srivastava and Salakhutdinov, 2012]. These models have been shown to perform very well on several problems, such as classification with complete or missing modality data and generation of the missing modality. In this work, we consider three nontrivial supervised learning tasks: (i) multi-class classification (MCC), (ii) multi-label classification (MLC) and (iii) label ranking (LR), listed in order of increasing output complexity. While multi-class classification deals with predicting one class for every instance, multi-label classification deals with predicting more than one class for every instance, and label ranking deals with assigning a rank to each label for every instance. Existing work in this field centers on formulating new error functions that force the network to identify the output dependencies.
The aim of our work is to adapt neural networks to implicitly handle feature extraction (dependencies) for the output within the network structure, removing the need for hand-crafted error functions. We show that multimodal deep architectures can be adapted for these types of problems (or data) by considering labels as one of the modalities. This also brings unsupervised pretraining to the output along with the input. We show that these models not only outperform standard deep neural networks, but also outperform standard adaptations of neural networks for individual domains under various metrics over the several data sets we considered. We observe that the advantage of our models over other models grows as the complexity of the output/problem increases.
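
As a reader's aid, the three output types named in the summary can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names, the 0.5 threshold, and the example scores are assumptions chosen only to show how the output grows in complexity from MCC to MLC to LR.

```python
# Illustrative only: hypothetical helpers, not code from the thesis.

def multi_class_target(scores):
    """MCC: exactly one class per instance, encoded as a one-hot vector."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return [1 if i == best else 0 for i in range(len(scores))]

def multi_label_target(scores, threshold=0.5):
    """MLC: any subset of labels per instance, encoded as a binary vector.

    The 0.5 threshold is an assumption made for this example.
    """
    return [1 if s >= threshold else 0 for s in scores]

def label_ranking_target(scores):
    """LR: a rank for every label, where rank 1 is the most relevant."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks

# Made-up relevance scores for four labels of a single instance.
scores = [0.7, 0.9, 0.1, 0.6]
mcc = multi_class_target(scores)    # one label selected
mlc = multi_label_target(scores)    # a set of labels selected
lr = label_ranking_target(scores)   # every label ordered
```

The same four scores yield a single class for MCC, a label subset for MLC, and a full ordering for LR, which is the sense in which the summary describes the output as increasingly complex.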

Similar Items