Stick-breaking policy learning in Dec-POMDPs

Expectation maximization (EM) has recently been shown to be an efficient algorithm for learning finite-state controllers (FSCs) in large decentralized POMDPs (Dec-POMDPs). However, current methods use fixed-size FSCs and often converge to maxima that are far from the optimal value. This paper repres...

Bibliographic Details
Main Authors:	Amato, Christopher (Author), Liao, Xuejun (Author), Carin, Lawrence (Author), Liu, Miao (Contributor), How, Jonathan P (Contributor)
Other Authors:	Massachusetts Institute of Technology. Department of Aeronautics and Astronautics (Contributor), Massachusetts Institute of Technology. Laboratory for Information and Decision Systems (Contributor)
Format:	Article
Language:	English
Published:	International Joint Conferences on Artificial Intelligence, Inc., 2016-10-21T19:07:30Z.
Subjects:	Article
