Representing, learning, and controlling complex object interactions

We present a framework for representing scenarios with complex object interactions, where a robot cannot directly interact with the object it wishes to control and must instead influence it via intermediate objects. For instance, a robot learning to drive a car can only change the car's pose indirectly via the steering wheel, and must represent and reason about the relationship between its own grippers and the steering wheel, and the relationship between the steering wheel and the car. We formalize these interactions as chains and graphs of Markov decision processes (MDPs) and show how such models can be learned from data. We also consider how they can be controlled given known or learned dynamics. We show that our complex model can be collapsed into a single MDP and solved to find an optimal policy for the combined system. Since the resulting MDP may be very large, we also introduce a planning algorithm that efficiently produces a potentially suboptimal policy. We apply these models to two systems in which a robot uses learning from demonstration to achieve indirect control: playing a computer game using a joystick, and using a hot water dispenser to heat a cup of water.
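
The abstract's key construction — collapsing a chain of MDPs into one MDP over the joint state space — can be sketched concretely. The Python snippet below is a minimal illustration, not the paper's implementation: it assumes a two-link chain in which the robot's action drives the first (intermediate-object) MDP and the intermediate object's state in turn drives the second (target-object) MDP. All state/action sizes, transition tables, rewards, and the specific coupling rule are made-up placeholders.

```python
import numpy as np

# Hedged sketch: a two-link chain of tabular MDPs. The robot acts on MDP 1
# (e.g., gripper -> steering wheel); MDP 1's state serves as the exogenous
# "action" driving MDP 2 (e.g., steering wheel -> car). All sizes, dynamics,
# and rewards below are hypothetical placeholders, not from the paper.

rng = np.random.default_rng(0)

S1, A = 4, 3        # states of the intermediate object, robot actions
S2 = 5              # states of the target object

# P1[a, s1, s1']: robot action a moves the intermediate object
P1 = rng.dirichlet(np.ones(S1), size=(A, S1))
# P2[s1, s2, s2']: the intermediate object's state drives the target object
P2 = rng.dirichlet(np.ones(S2), size=(S1, S2))
# Reward defined on the target object's state only (placeholder)
R = rng.standard_normal(S2)

# Collapse the chain into one MDP over the joint state (s1, s2):
#   Pr[(s1', s2') | (s1, s2), a] = P1[a, s1, s1'] * P2[s1, s2, s2']
# (here the target transitions under the *current* intermediate state,
# one illustrative coupling choice among several possible ones).
P = np.einsum('aij,ikl->aikjl', P1, P2)   # shape (A, S1, S2, S1', S2')
P = P.reshape(A, S1 * S2, S1 * S2)        # flatten joint states
Rj = np.tile(R, S1)                       # joint reward depends on s2 only

# Standard value iteration on the collapsed MDP.
gamma, V = 0.95, np.zeros(S1 * S2)
for _ in range(500):
    Q = Rj + gamma * P @ V                # Q[a, s] over joint states
    V_new = Q.max(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new
policy = Q.argmax(axis=0)                 # greedy policy over joint states
print(policy.reshape(S1, S2))
```

Because the joint state space is the product of the component state spaces (S1 × S2 here, larger for longer chains or graphs), the collapsed MDP grows quickly — which is the motivation the abstract gives for the additional planner that trades optimality for efficiency.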


Bibliographic Details
Main Authors: Zhou, Yilun (Author, Contributor), Burchfiel, Benjamin (Author), Konidaris, George (Author)
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Contributor)
Format: Article
Language: English
Published: Springer US, 2018-05-29.
Subjects: Robotics; Task representation; Task learning; Markov decision process
Online Access: Get fulltext (http://hdl.handle.net/1721.1/115928)
LEADER 02073 am a22002173u 4500
001 115928
042 |a dc 
100 1 0 |a Zhou, Yilun  |e author 
710 2 0 |a Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science  |e contributor 
700 1 0 |a Zhou, Yilun  |e contributor 
700 1 0 |a Burchfiel, Benjamin  |e author 
700 1 0 |a Konidaris, George  |e author 
245 0 0 |a Representing, learning, and controlling complex object interactions 
260 |b Springer US,   |c 2018-05-29T14:12:19Z. 
856 |z Get fulltext  |u http://hdl.handle.net/1721.1/115928 
520 |a We present a framework for representing scenarios with complex object interactions, where a robot cannot directly interact with the object it wishes to control and must instead influence it via intermediate objects. For instance, a robot learning to drive a car can only change the car's pose indirectly via the steering wheel, and must represent and reason about the relationship between its own grippers and the steering wheel, and the relationship between the steering wheel and the car. We formalize these interactions as chains and graphs of Markov decision processes (MDPs) and show how such models can be learned from data. We also consider how they can be controlled given known or learned dynamics. We show that our complex model can be collapsed into a single MDP and solved to find an optimal policy for the combined system. Since the resulting MDP may be very large, we also introduce a planning algorithm that efficiently produces a potentially suboptimal policy. We apply these models to two systems in which a robot uses learning from demonstration to achieve indirect control: playing a computer game using a joystick, and using a hot water dispenser to heat a cup of water. Keywords: Robotics, Task representation, Task learning, Markov decision process 
536 |a United States. Defense Advanced Research Projects Agency (D15AP00104) 
536 |a National Institutes of Health (U.S.) (R01MH109177) 
546 |a en 
655 7 |a Article 
773 |t Autonomous Robots