Pathologies of Temporal Difference Methods in Approximate Dynamic Programming

Approximate policy iteration methods based on temporal differences are popular in practice, and have been tested extensively, dating to the early nineties, but the associated convergence behavior is complex, and not well understood at present. An important question is whether the policy iteration pr...

Bibliographic Details
Main Author: Bertsekas, Dimitri P. (Contributor)
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Contributor)
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers, 2011-06-21.