Q-learning with nearest neighbors

© 2018 Curran Associates Inc. All rights reserved. We consider model-free reinforcement learning for infinite-horizon discounted Markov Decision Processes (MDPs) with a continuous state space and unknown transition kernel, when only a single sample path under an arbitrary policy of the system is avai...

Bibliographic Details
Main Authors:	Shah, Devavrat (Author), Xie, Qiaomin (Author)
Format:	Article
Language:	English
Published:	2021-11-09
Subjects:	Article
Online Access:	Get fulltext
