Policy Iteration for Continuous-Time Average Reward Markov Decision Processes in Polish Spaces
We study the policy iteration algorithm (PIA) for continuous-time jump Markov decision processes in general state and action spaces. The corresponding transition rates are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. The criterion that we are concerned with...
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | Hindawi Limited, 2009-01-01 |
Series: | Abstract and Applied Analysis |
Online Access: | http://dx.doi.org/10.1155/2009/103723 |