Provably efficient learning with typed parametric models

To quickly achieve good performance, reinforcement-learning algorithms for acting in large continuous-valued domains must use a representation that is both sufficiently powerful to capture important domain characteristics, and yet simultaneously allows generalization, or sharing, among experiences....

Bibliographic Details
Main Authors:	Brunskill, Emma (Contributor), Leffler, Bethany R. (Author), Li, Lihong (Author), Littman, Michael L. (Author), Roy, Nicholas (Contributor)
Other Authors:	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory (Contributor), Massachusetts Institute of Technology. Department of Aeronautics and Astronautics (Contributor)
Format:	Article
Language:	English
Published:	Journal of Machine Learning Research, 2010-11-29T17:59:03Z.
Subjects:	Article
Online Access:	Get fulltext


LEADER	02311 am a22002533u 4500
001	60042
042			\|a dc
100	1	0	\|a Brunskill, Emma \|e author
100	1	0	\|a Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory \|e contributor
100	1	0	\|a Massachusetts Institute of Technology. Department of Aeronautics and Astronautics \|e contributor
100	1	0	\|a Roy, Nicholas \|e contributor
100	1	0	\|a Brunskill, Emma \|e contributor
700	1	0	\|a Leffler, Bethany R. \|e author
700	1	0	\|a Li, Lihong \|e author
700	1	0	\|a Littman, Michael L. \|e author
700	1	0	\|a Roy, Nicholas \|e author
245	0	0	\|a Provably efficient learning with typed parametric models
260			\|b Journal of Machine Learning Research, \|c 2010-11-29T17:59:03Z.
856			\|z Get fulltext \|u http://hdl.handle.net/1721.1/60042
520			\|a To quickly achieve good performance, reinforcement-learning algorithms for acting in large continuous-valued domains must use a representation that is both sufficiently powerful to capture important domain characteristics, and yet simultaneously allows generalization, or sharing, among experiences. Our algorithm balances this tradeoff by using a stochastic, switching, parametric dynamics representation. We argue that this model characterizes a number of significant, real-world domains, such as robot navigation across varying terrain. We prove that this representational assumption allows our algorithm to be probably approximately correct with a sample complexity that scales polynomially with all problem-specific quantities including the state-space dimension. We also explicitly incorporate the error introduced by approximate planning in our sample complexity bounds, in contrast to prior Probably Approximately Correct (PAC) Markov Decision Processes (MDP) approaches, which typically assume the estimated MDP can be solved exactly. Our experimental results on constructing plans for driving to work using real car trajectory data, as well as a small robot experiment on navigating varying terrain, demonstrate that our dynamics representation enables us to capture real-world dynamics in a sufficient manner to produce good performance.
546			\|a en_US
655	7		\|a Article
773			\|t Journal of Machine Learning Research

Similar Items