RGB-D-Based Object Recognition Using Multimodal Convolutional Neural Networks: A Survey
Object recognition in real-world environments is one of the fundamental and key tasks in computer vision and robotics communities. With the advanced sensing technologies and low-cost depth sensors, the high-quality RGB and depth images can be recorded synchronously, and the object recognition perfor...
Main Authors: | Mingliang Gao, Jun Jiang, Guofeng Zou, Vijay John, Zheng Liu |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2019-01-01 |
Series: | IEEE Access |
Subjects: | Convolutional neural network, multimodal fusion, object recognition, RGB-D, survey |
Online Access: | https://ieeexplore.ieee.org/document/8683987/ |
id |
doaj-e136822c24114918ab8e11e350027253 |
---|---|
record_format |
Article |
spelling |
doaj-e136822c24114918ab8e11e350027253; 2021-03-29T22:48:17Z; eng; IEEE; IEEE Access; 2169-3536; 2019-01-01; 7; 43110; 43136; 10.1109/ACCESS.2019.2907071; 8683987; RGB-D-Based Object Recognition Using Multimodal Convolutional Neural Networks: A Survey; Mingliang Gao (0, https://orcid.org/0000-0001-7273-7499); Jun Jiang (1); Guofeng Zou (2, https://orcid.org/0000-0002-8023-0142); Vijay John (3); Zheng Liu (4); School of Electrical and Electronic Engineering, Shandong University of Technology, Zibo, China; School of Computer Science and Technology, Southwest Petroleum University, Chengdu, China; School of Electrical and Electronic Engineering, Shandong University of Technology, Zibo, China; Intelligent Information Processing Laboratory, Toyota Technological Institute, Nagoya, Japan; Faculty of Applied Science, The University of British Columbia, Vancouver, Canada; Object recognition in real-world environments is one of the fundamental and key tasks in the computer vision and robotics communities. With advanced sensing technologies and low-cost depth sensors, high-quality RGB and depth images can be recorded synchronously, and the object recognition performance can be improved by jointly exploiting them. RGB-D-based object recognition has evolved from early methods that used hand-crafted representations to the current state-of-the-art deep learning-based methods. With the undeniable success of deep learning, especially convolutional neural networks (CNNs) in the visual domain, the natural progression of deep learning research points to problems involving larger and more complex multimodal data. In this paper, we provide a comprehensive survey of recent multimodal CNN (MMCNN)-based approaches that have demonstrated significant improvements over previous methods. We highlight two key issues, namely, training data deficiency and multimodal fusion. In addition, we summarize and discuss the publicly available RGB-D object recognition datasets and present a comparative performance evaluation of the proposed methods on these benchmark datasets. Finally, we identify promising avenues of research in this rapidly evolving field. This survey will not only enable researchers to get a good overview of the state-of-the-art methods for RGB-D-based object recognition but also provide a reference for other multimodal machine learning applications, e.g., multimodal medical image fusion, audio-visual speech recognition, and multimedia retrieval and generation.; https://ieeexplore.ieee.org/document/8683987/; Convolutional neural network; multimodal fusion; object recognition; RGB-D; survey |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Mingliang Gao Jun Jiang Guofeng Zou Vijay John Zheng Liu |
spellingShingle |
Mingliang Gao Jun Jiang Guofeng Zou Vijay John Zheng Liu RGB-D-Based Object Recognition Using Multimodal Convolutional Neural Networks: A Survey IEEE Access Convolutional neural network multimodal fusion object recognition RGB-D survey |
author_facet |
Mingliang Gao Jun Jiang Guofeng Zou Vijay John Zheng Liu |
author_sort |
Mingliang Gao |
title |
RGB-D-Based Object Recognition Using Multimodal Convolutional Neural Networks: A Survey |
title_short |
RGB-D-Based Object Recognition Using Multimodal Convolutional Neural Networks: A Survey |
title_full |
RGB-D-Based Object Recognition Using Multimodal Convolutional Neural Networks: A Survey |
title_fullStr |
RGB-D-Based Object Recognition Using Multimodal Convolutional Neural Networks: A Survey |
title_full_unstemmed |
RGB-D-Based Object Recognition Using Multimodal Convolutional Neural Networks: A Survey |
title_sort |
rgb-d-based object recognition using multimodal convolutional neural networks: a survey |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2019-01-01 |
description |
Object recognition in real-world environments is one of the fundamental and key tasks in the computer vision and robotics communities. With advanced sensing technologies and low-cost depth sensors, high-quality RGB and depth images can be recorded synchronously, and the object recognition performance can be improved by jointly exploiting them. RGB-D-based object recognition has evolved from early methods that used hand-crafted representations to the current state-of-the-art deep learning-based methods. With the undeniable success of deep learning, especially convolutional neural networks (CNNs) in the visual domain, the natural progression of deep learning research points to problems involving larger and more complex multimodal data. In this paper, we provide a comprehensive survey of recent multimodal CNN (MMCNN)-based approaches that have demonstrated significant improvements over previous methods. We highlight two key issues, namely, training data deficiency and multimodal fusion. In addition, we summarize and discuss the publicly available RGB-D object recognition datasets and present a comparative performance evaluation of the proposed methods on these benchmark datasets. Finally, we identify promising avenues of research in this rapidly evolving field. This survey will not only enable researchers to get a good overview of the state-of-the-art methods for RGB-D-based object recognition but also provide a reference for other multimodal machine learning applications, e.g., multimodal medical image fusion, audio-visual speech recognition, and multimedia retrieval and generation. |
topic |
Convolutional neural network multimodal fusion object recognition RGB-D survey |
url |
https://ieeexplore.ieee.org/document/8683987/ |
work_keys_str_mv |
AT minglianggao rgbdbasedobjectrecognitionusingmultimodalconvolutionalneuralnetworksasurvey AT junjiang rgbdbasedobjectrecognitionusingmultimodalconvolutionalneuralnetworksasurvey AT guofengzou rgbdbasedobjectrecognitionusingmultimodalconvolutionalneuralnetworksasurvey AT vijayjohn rgbdbasedobjectrecognitionusingmultimodalconvolutionalneuralnetworksasurvey AT zhengliu rgbdbasedobjectrecognitionusingmultimodalconvolutionalneuralnetworksasurvey |
_version_ |
1724190854719995904 |