Robot grasp detection using multimodal deep convolutional neural networks


Bibliographic Details
Main Authors: Zhichao Wang, Zhiqi Li, Bin Wang, Hong Liu
Format: Article
Language: English
Published: SAGE Publishing 2016-09-01
Series: Advances in Mechanical Engineering
Online Access: https://doi.org/10.1177/1687814016668077
id doaj-dc7ab49b61e74a78ab281aa8fc212a82
record_format Article
collection DOAJ
language English
format Article
sources DOAJ
author Zhichao Wang
Zhiqi Li
Bin Wang
Hong Liu
title Robot grasp detection using multimodal deep convolutional neural networks
publisher SAGE Publishing
series Advances in Mechanical Engineering
issn 1687-8140
publishDate 2016-09-01
description Autonomous manipulation has enabled a wide range of exciting robot tasks. However, perceiving the outside environment remains a challenging problem in intelligent robotics research due to the lack of object models, unstructured environments, and time-consuming computation. In this article, we present a novel robot grasp detection system that maps a pair of RGB-D images of a novel object to the best grasping pose of a robotic gripper. First, we segment the graspable objects from the unstructured scene using the geometric features of both the object and the robotic gripper. Then, a deep convolutional neural network is applied to these graspable objects to find the best graspable area for each object. To improve the efficiency of the detection system, we introduce a structured penalty term that optimizes the connections between modalities, which significantly reduces the complexity of the network and outperforms fully connected multimodal processing. We also present a two-stage closed-loop grasping-candidate estimator that accelerates the generation of grasping candidates. Moreover, combining the two-stage estimator with the grasp detection network naturally improves detection accuracy. Experiments validate the proposed methods: our approach outperforms the state of the art and runs at real-time speed.
url https://doi.org/10.1177/1687814016668077
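The abstract mentions a structured penalty term on the connections between modalities but does not give its exact form. As a hypothetical illustration only, the sketch below implements one common choice for such structure, a group-sparsity (L2,1-style) penalty over per-modality weight blocks; the function name, the block layout, and the parameter `lam` are assumptions for illustration, not the authors' formulation.

```python
import numpy as np

def cross_modal_group_penalty(W, d_rgb, lam=0.01):
    """Group-sparsity penalty on a fused first-layer weight matrix.

    W     : (d_rgb + d_depth, h) weights mapping stacked [rgb; depth]
            features to h hidden units
    d_rgb : number of RGB feature dimensions (rows 0..d_rgb-1 of W)
    lam   : penalty strength

    For each hidden unit, the L2 norm of its RGB block and of its depth
    block are penalized separately, encouraging whole per-modality
    connection groups to shrink to zero rather than individual weights.
    """
    W_rgb, W_depth = W[:d_rgb], W[d_rgb:]
    per_unit = np.linalg.norm(W_rgb, axis=0) + np.linalg.norm(W_depth, axis=0)
    return lam * per_unit.sum()
```

Added to the training loss, a penalty of this shape prunes entire modality-to-unit connection groups, which is one way a network can end up sparser (and cheaper) than fully connected multimodal processing.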
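The two-stage closed-loop grasping-candidate estimator is likewise described only at a high level. A minimal coarse-to-fine sketch over a 2D score map conveys the general idea of accelerating candidate search; the score-map input, `coarse_step`, and `fine_radius` are invented stand-ins, not the paper's actual pipeline.

```python
import numpy as np

def two_stage_search(score, coarse_step=8, fine_radius=4):
    """Coarse-to-fine search for the best cell in a 2D score map.

    Stage 1 evaluates only a stride-`coarse_step` grid of positions;
    stage 2 searches exhaustively in a small window around the coarse
    winner. Returns the (row, col) of the refined best position.
    """
    H, W = score.shape
    # Stage 1: sample the map on a coarse grid and pick the best cell.
    ys, xs = np.arange(0, H, coarse_step), np.arange(0, W, coarse_step)
    coarse = score[np.ix_(ys, xs)]
    cy, cx = np.unravel_index(np.argmax(coarse), coarse.shape)
    cy, cx = ys[cy], xs[cx]
    # Stage 2: dense search in a window around the coarse winner.
    y0, y1 = max(0, cy - fine_radius), min(H, cy + fine_radius + 1)
    x0, x1 = max(0, cx - fine_radius), min(W, cx + fine_radius + 1)
    window = score[y0:y1, x0:x1]
    fy, fx = np.unravel_index(np.argmax(window), window.shape)
    return y0 + fy, x0 + fx
```

Compared with scoring every position, this evaluates roughly `1/coarse_step**2` of the map plus a small fixed window, which is the usual source of speed-up in two-stage candidate generation.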