Composable Instructions and Prospection Guided Visuomotor Control for Robotic Manipulation
Deep neural network-based end-to-end visuomotor control for robotic manipulation has recently become an active research topic in robotics. In this framework, a one-hot vector is often used to specify the task in multi-task settings. However, a one-hot vector is an inflexible way to describe multiple tasks and transmit intenti...
Main Authors: | Quanquan Shao, Jie Hu, Weiming Wang, Yi Fang, Mingshuo Han, Jin Qi, Jin Ma |
---|---|
Format: | Article |
Language: | English |
Published: | Atlantis Press, 2019-10-01 |
Series: | International Journal of Computational Intelligence Systems |
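Subjects: | Composable instructions, Motion generation, Prospection, Imitation learning, Visuomotor control, Robotic manipulation |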
Online Access: | https://www.atlantis-press.com/article/125921091/view |
id |
doaj-3fc47d6e6e764c5993a5c9240adddf76 |
---|---|
record_format |
Article |
spelling |
doaj-3fc47d6e6e764c5993a5c9240adddf76; 2020-11-25T02:38:21Z; eng; Atlantis Press; International Journal of Computational Intelligence Systems; 1875-6883; 2019-10-01; 12; 2; 10.2991/ijcis.d.191017.001; Composable Instructions and Prospection Guided Visuomotor Control for Robotic Manipulation; Quanquan Shao, Jie Hu, Weiming Wang, Yi Fang, Mingshuo Han, Jin Qi, Jin Ma; https://www.atlantis-press.com/article/125921091/view; Composable instructions, Motion generation, Prospection, Imitation learning, Visuomotor control, Robotic manipulation |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Quanquan Shao, Jie Hu, Weiming Wang, Yi Fang, Mingshuo Han, Jin Qi, Jin Ma |
author_sort |
Quanquan Shao |
title |
Composable Instructions and Prospection Guided Visuomotor Control for Robotic Manipulation |
publisher |
Atlantis Press |
series |
International Journal of Computational Intelligence Systems |
issn |
1875-6883 |
publishDate |
2019-10-01 |
description |
Deep neural network-based end-to-end visuomotor control for robotic manipulation has recently become an active research topic in robotics. In this framework, a one-hot vector is often used to specify the task in multi-task settings. However, a one-hot vector is an inflexible way to describe multiple tasks and to convey human intentions. This paper proposes a framework that combines composable instructions with visuomotor control for multi-task problems. The framework consists mainly of two modules: a variational autoencoder (VAE) network and a long short-term memory (LSTM) network. The VAE encodes perception information about the environment into a small latent space. The LSTM module combines the embedded perception information with the composable instructions to guide robot motion according to different intentions. Prospection is also used to learn the purposes of instructions: the network predicts not only the next action but also a sequence of future actions at the same time. To evaluate the framework, a series of experiments is conducted in pick-and-place scenarios. For new tasks, the framework achieves a success rate of 91.2%, which indicates good generalization ability. |
topic |
Composable instructions, Motion generation, Prospection, Imitation learning, Visuomotor control, Robotic manipulation |
url |
https://www.atlantis-press.com/article/125921091/view |
_version_ |
1724791474095128576 |
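
The description in this record contrasts one-hot task vectors with composable instructions, and characterizes prospection as predicting a sequence of future actions rather than only the next one. The following is a minimal sketch of both ideas in plain Python. The vocabularies (`VERBS`, `OBJECTS`, `TARGETS`), the slot layout, the action names, and the `horizon` parameter are illustrative assumptions and are not taken from the article.

```python
# Illustrative sketch only: vocabularies, slot layout and horizon are assumptions,
# not the encoding used in the article.
from itertools import product

VERBS = ["pick", "place"]
OBJECTS = ["red_block", "blue_block", "green_block"]
TARGETS = ["left_bin", "right_bin"]


def one_hot(index, size):
    """Return a list of `size` zeros with a single 1 at `index`."""
    return [1 if i == index else 0 for i in range(size)]


def task_one_hot(task, known_tasks):
    """Monolithic encoding: one dimension per whole task in `known_tasks`.

    Raises ValueError for a task that is not in `known_tasks`.
    """
    return one_hot(known_tasks.index(task), len(known_tasks))


def composable_instruction(verb, obj, target):
    """Composable encoding: concatenate one one-hot slot per instruction part."""
    return (one_hot(VERBS.index(verb), len(VERBS))
            + one_hot(OBJECTS.index(obj), len(OBJECTS))
            + one_hot(TARGETS.index(target), len(TARGETS)))


def prospection_targets(actions, horizon):
    """Pair each step t with the next `horizon` actions as its training target.

    Steps that lack a full window of future actions are skipped.
    """
    return [(t, actions[t + 1:t + 1 + horizon])
            for t in range(len(actions) - horizon)]


if __name__ == "__main__":
    all_tasks = list(product(VERBS, OBJECTS, TARGETS))
    print(len(all_tasks), "tasks need", len(all_tasks), "one-hot dimensions")
    print("composable vector length:", len(composable_instruction(*all_tasks[0])))
    print(prospection_targets(["reach", "grasp", "lift", "move", "release"], 2))
```

Under these assumed vocabularies, 12 distinct tasks fit in a 7-dimensional composable vector, and any new combination of known parts still has a valid encoding, whereas `task_one_hot` raises `ValueError` for a task outside its list.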