Policy Iteration for Continuous-Time Average Reward Markov Decision Processes in Polish Spaces
We study the policy iteration algorithm (PIA) for continuous-time jump Markov decision processes in general state and action spaces. The corresponding transition rates are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. The criterion that we are concerned with...
Main Authors: | Quanxin Zhu, Xinsong Yang, Chuangxia Huang |
---|---|
Format: | Article |
Language: | English |
Published: | Hindawi Limited, 2009-01-01 |
Series: | Abstract and Applied Analysis |
Online Access: | http://dx.doi.org/10.1155/2009/103723 |
Similar Items
-
Regret-based Reward Elicitation for Markov Decision Processes
by: Regan, Kevin
Published: (2014) -
Elicitation and planning in Markov decision processes with unknown rewards
by: Alizadeh, Pegah
Published: (2016) -
Acceleration of Iterative Methods for Markov Decision Processes
by: Shlakhter, Oleksandr
Published: (2010) -
The Average Shadowing Property in Continuous Iterated Function Systems
by: M. Fatehi Nia
Published: (2015-09-01)