Symblicit algorithms for optimal strategy synthesis in monotonic Markov decision processes
When treating Markov decision processes (MDPs) with large state spaces, using explicit representations quickly becomes unfeasible. Lately, Wimmer et al. have proposed a so-called symblicit algorithm for the synthesis of optimal strategies in MDPs, in the quantitative setting of expected mean-payoff....
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: |
Open Publishing Association
2014-07-01
|
Series: | Electronic Proceedings in Theoretical Computer Science |
Online Access: | http://arxiv.org/pdf/1407.5396v1 |
id |
doaj-7532704daa904753bd09a4b80fc8c6ab |
---|---|
record_format |
Article |
spelling |
doaj-7532704daa904753bd09a4b80fc8c6ab2020-11-25T01:39:48ZengOpen Publishing AssociationElectronic Proceedings in Theoretical Computer Science2075-21802014-07-01157Proc. SYNT 2014516710.4204/EPTCS.157.8:1bSymblicit algorithms for optimal strategy synthesis in monotonic Markov decision processesAaron Bohy0Véronique Bruyère1Jean-François Raskin2 Université de Mons Université de Mons Université Libre de Bruxelles When treating Markov decision processes (MDPs) with large state spaces, using explicit representations quickly becomes unfeasible. Lately, Wimmer et al. have proposed a so-called symblicit algorithm for the synthesis of optimal strategies in MDPs, in the quantitative setting of expected mean-payoff. This algorithm, based on the strategy iteration algorithm of Howard and Veinott, efficiently combines symbolic and explicit data structures, and uses binary decision diagrams as symbolic representation. The aim of this paper is to show that the new data structure of pseudo-antichains (an extension of antichains) provides another interesting alternative, especially for the class of monotonic MDPs. We design efficient pseudo-antichain based symblicit algorithms (with open source implementations) for two quantitative settings: the expected mean-payoff and the stochastic shortest path. For two practical applications coming from automated planning and LTL synthesis, we report promising experimental results w.r.t. both the run time and the memory consumption.http://arxiv.org/pdf/1407.5396v1 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Aaron Bohy Véronique Bruyère Jean-François Raskin |
spellingShingle |
Aaron Bohy Véronique Bruyère Jean-François Raskin Symblicit algorithms for optimal strategy synthesis in monotonic Markov decision processes Electronic Proceedings in Theoretical Computer Science |
author_facet |
Aaron Bohy Véronique Bruyère Jean-François Raskin |
author_sort |
Aaron Bohy |
title |
Symblicit algorithms for optimal strategy synthesis in monotonic Markov decision processes |
title_short |
Symblicit algorithms for optimal strategy synthesis in monotonic Markov decision processes |
title_full |
Symblicit algorithms for optimal strategy synthesis in monotonic Markov decision processes |
title_fullStr |
Symblicit algorithms for optimal strategy synthesis in monotonic Markov decision processes |
title_full_unstemmed |
Symblicit algorithms for optimal strategy synthesis in monotonic Markov decision processes |
title_sort |
symblicit algorithms for optimal strategy synthesis in monotonic markov decision processes |
publisher |
Open Publishing Association |
series |
Electronic Proceedings in Theoretical Computer Science |
issn |
2075-2180 |
publishDate |
2014-07-01 |
description |
When treating Markov decision processes (MDPs) with large state spaces, using explicit representations quickly becomes unfeasible. Lately, Wimmer et al. have proposed a so-called symblicit algorithm for the synthesis of optimal strategies in MDPs, in the quantitative setting of expected mean-payoff. This algorithm, based on the strategy iteration algorithm of Howard and Veinott, efficiently combines symbolic and explicit data structures, and uses binary decision diagrams as symbolic representation. The aim of this paper is to show that the new data structure of pseudo-antichains (an extension of antichains) provides another interesting alternative, especially for the class of monotonic MDPs. We design efficient pseudo-antichain based symblicit algorithms (with open source implementations) for two quantitative settings: the expected mean-payoff and the stochastic shortest path. For two practical applications coming from automated planning and LTL synthesis, we report promising experimental results w.r.t. both the run time and the memory consumption. |
url |
http://arxiv.org/pdf/1407.5396v1 |
work_keys_str_mv |
AT aaronbohy symblicitalgorithmsforoptimalstrategysynthesisinmonotonicmarkovdecisionprocesses AT veroniquebruyere symblicitalgorithmsforoptimalstrategysynthesisinmonotonicmarkovdecisionprocesses AT jeanfrancoisraskin symblicitalgorithmsforoptimalstrategysynthesisinmonotonicmarkovdecisionprocesses |
_version_ |
1725049084822159360 |