Projected equation and aggregation-based approximate dynamic programming methods for Tetris

Bibliographic Details
Main Author:	Hwang, Daw-sen
Other Authors:	Dimitri P. Bertsekas.
Format:	Others
Language:	English
Published:	Massachusetts Institute of Technology 2011
Subjects:	Electrical Engineering and Computer Science.
Online Access:	http://hdl.handle.net/1721.1/66033

Description
Summary:	Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. === Cataloged from PDF version of thesis. === Includes bibliographical references (p. 65-67). === In this thesis, we survey approximate dynamic programming (ADP) methods and test the methods with the game of Tetris. We focus on ADP methods where the cost-to-go function J is approximated with Φr, where Φ is some matrix and r is a vector with relatively low dimension. There are two major categories of methods: projected equation methods and aggregation methods. In projected equation methods, the cost-to-go function approximation Φr is updated by simulation using one of several policy-updated algorithms such as LSTD(λ) [BB96] and LSPE(λ) [BI96]. Projected equation methods generally may not converge. We define a pseudometric of policies and view the oscillations of policies in Tetris. Aggregation methods are based on a model approximation approach. The original problem is reduced to an aggregate problem with significantly fewer states. The weight vector r is the cost-to-go function of the aggregate problem and Φ is the matrix of aggregation probabilities. In aggregation methods, the vector r converges to the optimal cost-to-go function of the aggregate problem. In this thesis, we implement aggregation methods for Tetris, and compare the performance of projected equation methods and aggregation methods. === by Daw-sen Hwang. === S.M.
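
Illustration (not taken from the thesis): the aggregation idea in the summary can be sketched on a toy problem. The sketch below assumes a four-state Markov chain under a single fixed policy, a discount factor alpha, and hard aggregation into two groups; the matrices P, D, Phi, the cost vector g, and the function names are all hypothetical. The thesis itself treats Tetris and the optimal cost-to-go, which would add a minimization over controls. The iteration r <- D(g + alpha * P * Phi * r) is a contraction, so r converges to the cost-to-go of the aggregate problem, and Phi * r approximates J.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def aggregate_policy_evaluation(P, g, D, Phi, alpha, iters=1000, tol=1e-12):
    """Iterate r <- D (g + alpha * P * Phi * r) until successive iterates agree.

    P     : transition matrix of the original chain (assumed, fixed policy)
    g     : expected one-stage cost per original state (assumed)
    D     : disaggregation probabilities (aggregate state -> original states)
    Phi   : aggregation probabilities (original state -> aggregate states)
    alpha : discount factor in (0, 1) (assumed; Tetris itself is undiscounted)
    """
    r = [0.0] * len(D)
    for _ in range(iters):
        J = matvec(Phi, r)                                  # lift r to the original states
        PJ = matvec(P, J)                                   # expected next-state cost
        TJ = [gi + alpha * pj for gi, pj in zip(g, PJ)]     # one Bellman backup
        r_new = matvec(D, TJ)                               # average back onto aggregate states
        if max(abs(a - b) for a, b in zip(r_new, r)) < tol:
            return r_new
        r = r_new
    return r


# Toy data: states 0,1 form aggregate state 0; states 2,3 form aggregate state 1.
P = [
    [0.50, 0.50, 0.00, 0.00],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.00, 0.00, 0.50, 0.50],
]
g = [1.0, 2.0, 3.0, 4.0]
Phi = [[1, 0], [1, 0], [0, 1], [0, 1]]          # hard aggregation: 0/1 membership
D = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]  # uniform within each group
alpha = 0.9

r = aggregate_policy_evaluation(P, g, D, Phi, alpha)
J_approx = matvec(Phi, r)  # piecewise-constant approximation of J
```

With hard aggregation the approximation Phi * r is constant within each group, which is why r has only as many entries as there are aggregate states.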

Similar Items