A unified framework for temporal difference methods
We propose a unified framework for a broad class of methods to solve projected equations that approximate the solution of a high-dimensional fixed point problem within a subspace S spanned by a small number of basis functions or features. These methods originated in approximate dynamic programming (...
Main Author: | |
---|---|
Other Authors: | , |
Format: | Article |
Language: | English |
Published: |
Institute of Electrical and Electronics Engineers,
2010-10-01T18:17:46Z.
|
Subjects: | |
Online Access: | Get fulltext |
Summary: | We propose a unified framework for a broad class of methods to solve projected equations that approximate the solution of a high-dimensional fixed point problem within a subspace S spanned by a small number of basis functions or features. These methods originated in approximate dynamic programming (DP), where they are collectively known as temporal difference (TD) methods. Our framework is based on a connection with projection methods for monotone variational inequalities, which involve alternative representations of the subspace S (feature scaling). Our methods admit simulation-based implementations, and even when specialized to DP problems, include extensions/new versions of the standard TD algorithms, which offer some special implementation advantages and reduced overhead. National Science Foundation (U.S.) (NSF grant ECCS-0801549) |
---|