Policy Iteration for Continuous-Time Average Reward Markov Decision Processes in Polish Spaces

We study the policy iteration algorithm (PIA) for continuous-time jump Markov decision processes in general state and action spaces. The corresponding transition rates are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. The criterion that we are concerned with...

Bibliographic Details
Main Authors: Quanxin Zhu, Xinsong Yang, Chuangxia Huang
Format: Article
Language: English
Published: Hindawi Limited 2009-01-01
Series: Abstract and Applied Analysis
Online Access: http://dx.doi.org/10.1155/2009/103723