Stick-breaking policy learning in Dec-POMDPs

Expectation maximization (EM) has recently been shown to be an efficient algorithm for learning finite-state controllers (FSCs) in large decentralized POMDPs (Dec-POMDPs). However, current methods use fixed-size FSCs and often converge to local maxima whose value is far from optimal. This paper repres...

Bibliographic Details
Main Authors: Amato, Christopher (Author), Liao, Xuejun (Author), Carin, Lawrence (Author), Liu, Miao (Contributor), How, Jonathan P (Contributor)
Other Authors: Massachusetts Institute of Technology. Department of Aeronautics and Astronautics (Contributor), Massachusetts Institute of Technology. Laboratory for Information and Decision Systems (Contributor)
Format: Article
Language: English
Published: International Joint Conferences on Artificial Intelligence, Inc., 2016-10-21T19:07:30Z.