Composable Instructions and Prospection Guided Visuomotor Control for Robotic Manipulation

Deep neural network-based end-to-end visuomotor control for robotic manipulation has recently become a hot topic in robotics. In this framework, a one-hot vector is often used to handle multi-task settings. However, it is inflexible to use a one-hot vector to describe multiple tasks and transmit intenti...

Bibliographic Details
Main Authors: Quanquan Shao, Jie Hu, Weiming Wang, Yi Fang, Mingshuo Han, Jin Qi, Jin Ma
Format: Article
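Language: English
Published: Atlantis Press 2019-10-01
Series: International Journal of Computational Intelligence Systems
Subjects: Composable instructions; Motion generation; Prospection; Imitation learning; Visuomotor control; Robotic manipulation
Online Access: https://www.atlantis-press.com/article/125921091/view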
id doaj-3fc47d6e6e764c5993a5c9240adddf76
record_format Article
spelling doaj-3fc47d6e6e764c5993a5c9240adddf76; 2020-11-25T02:38:21Z; eng; Atlantis Press; International Journal of Computational Intelligence Systems; 1875-6883; 2019-10-01; 12; 2; 10.2991/ijcis.d.191017.001; Composable Instructions and Prospection Guided Visuomotor Control for Robotic Manipulation; Quanquan Shao, Jie Hu, Weiming Wang, Yi Fang, Mingshuo Han, Jin Qi, Jin Ma; https://www.atlantis-press.com/article/125921091/view; Composable instructions, Motion generation, Prospection, Imitation learning, Visuomotor control, Robotic manipulation
collection DOAJ
language English
format Article
sources DOAJ
author Quanquan Shao
Jie Hu
Weiming Wang
Yi Fang
Mingshuo Han
Jin Qi
Jin Ma
title Composable Instructions and Prospection Guided Visuomotor Control for Robotic Manipulation
publisher Atlantis Press
series International Journal of Computational Intelligence Systems
issn 1875-6883
publishDate 2019-10-01
description Deep neural network-based end-to-end visuomotor control for robotic manipulation has recently become a hot topic in robotics. In this framework, a one-hot vector is often used to handle multi-task settings. However, a one-hot vector is an inflexible way to describe multiple tasks and convey human intentions. This paper proposes a framework that combines composable instructions with visuomotor control for multi-task problems. The framework consists mainly of two modules: variational autoencoder (VAE) networks and long short-term memory (LSTM) networks. Perception information about the environment is encoded by the VAE into a small latent space. The LSTM module combines the embedded perception information with the composable instructions to guide robotic motion according to different intentions. Prospection is also used to learn the purposes of instructions: the model predicts not only the next action but also a sequence of future actions at the same time. To evaluate the framework, a series of experiments is conducted in pick-and-place scenarios. On new tasks, the framework achieves a success rate of 91.2%, which indicates good generalization ability.
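The contrast drawn in the description between a single task one-hot vector and composable instructions, together with the idea of prospection targets, can be illustrated with a small sketch. Everything named here is an assumption made for illustration: the record does not give the paper's instruction vocabulary, vector layout, prediction horizon, or padding rule, so `OBJECTS`, `TARGETS`, `composable_instruction`, `prospection_targets`, and `policy_input` are hypothetical. The VAE and LSTM themselves are not implemented; `latent` merely stands in for the VAE's output.

```python
from typing import List, Sequence

# Hypothetical vocabularies; the record does not list the paper's instruction set.
OBJECTS = ["red_block", "green_block", "blue_block"]
TARGETS = ["left_bin", "right_bin"]


def one_hot(index: int, size: int) -> List[float]:
    """Return a one-hot vector of the given size."""
    if not 0 <= index < size:
        raise ValueError("index out of range")
    return [1.0 if i == index else 0.0 for i in range(size)]


def task_one_hot(obj: str, target: str) -> List[float]:
    """Baseline: one slot per (object, target) task; size is the product of the vocabularies."""
    task_id = OBJECTS.index(obj) * len(TARGETS) + TARGETS.index(target)
    return one_hot(task_id, len(OBJECTS) * len(TARGETS))


def composable_instruction(obj: str, target: str) -> List[float]:
    """Composable variant (assumed layout): concatenate one sub-vector per instruction element."""
    return one_hot(OBJECTS.index(obj), len(OBJECTS)) + one_hot(
        TARGETS.index(target), len(TARGETS)
    )


def prospection_targets(
    actions: Sequence[Sequence[float]], t: int, horizon: int
) -> List[List[float]]:
    """Training target at step t: the next `horizon` actions rather than only the next one.

    Past the end of the demonstration the final action is repeated; this padding
    rule is an assumption of the sketch, not something stated in the record.
    """
    if not actions:
        raise ValueError("empty demonstration")
    last = len(actions) - 1
    return [list(actions[min(t + 1 + k, last)]) for k in range(horizon)]


def policy_input(latent: Sequence[float], instruction: Sequence[float]) -> List[float]:
    """Per-step input to the recurrent module: latent perception code plus instruction."""
    return list(latent) + list(instruction)
```

With these toy vocabularies the task one-hot needs 3 × 2 = 6 slots, while the composable encoding needs 3 + 2 = 5 and can express an object and target pairing that never appeared together in training, which is one plausible reading of why composability would help on new tasks.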
topic Composable instructions
Motion generation
Prospection
Imitation learning
Visuomotor control
Robotic manipulation
url https://www.atlantis-press.com/article/125921091/view