HMCTS-OP: Hierarchical MCTS Based Online Planning in the Asymmetric Adversarial Environment

The Monte Carlo Tree Search (MCTS) has demonstrated excellent performance in solving many planning problems. However, the state space and the branching factors are huge, and the planning horizon is long in many practical applications, especially in the adversarial environment. It is computationally...

Bibliographic Details
Main Authors: Lina Lu, Wanpeng Zhang, Xueqiang Gu, Xiang Ji, Jing Chen
Format: Article
Language: English
Published: MDPI AG 2020-05-01
Series: Symmetry
Subjects: HMCTS; online planning; MAXQ; asymmetric adversarial environment
Online Access: https://www.mdpi.com/2073-8994/12/5/719
id doaj-b92ca079c8bf42c796a9e7cefbe0563d
record_format Article
spelling doaj-b92ca079c8bf42c796a9e7cefbe0563d
2020-11-25T03:52:20Z
eng
MDPI AG
Symmetry
2073-8994
2020-05-01
12
719
719
10.3390/sym12050719
HMCTS-OP: Hierarchical MCTS Based Online Planning in the Asymmetric Adversarial Environment
Lina Lu
Wanpeng Zhang
Xueqiang Gu
Xiang Ji
Jing Chen
College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, Hunan, China (affiliation listed for all five authors)
The Monte Carlo Tree Search (MCTS) has demonstrated excellent performance in solving many planning problems. However, the state space and the branching factors are huge, and the planning horizon is long in many practical applications, especially in the adversarial environment. It is computationally expensive to cover a sufficient number of rewarded states that are far away from the root in the flat non-hierarchical MCTS. Therefore, the flat non-hierarchical MCTS is inefficient for dealing with planning problems with a long planning horizon, huge state space, and branching factors. In this work, we propose a novel hierarchical MCTS-based online planning method named the HMCTS-OP to tackle this issue. The HMCTS-OP integrates the MAXQ-based task hierarchies and the hierarchical MCTS algorithms into the online planning framework. Specifically, the MAXQ-based task hierarchies reduce the search space and guide the search process. Therefore, the computational complexity is significantly reduced. Moreover, the reduction in the computational complexity enables the MCTS to perform a deeper search to find better action in a limited time. We evaluate the performance of the HMCTS-OP in the domain of online planning in the asymmetric adversarial environment. The experiment results show that the HMCTS-OP outperforms other online planning methods in this domain.
https://www.mdpi.com/2073-8994/12/5/719
HMCTS
online planning
MAXQ
asymmetric adversarial environment
collection DOAJ
language English
format Article
sources DOAJ
author Lina Lu
Wanpeng Zhang
Xueqiang Gu
Xiang Ji
Jing Chen
spellingShingle Lina Lu
Wanpeng Zhang
Xueqiang Gu
Xiang Ji
Jing Chen
HMCTS-OP: Hierarchical MCTS Based Online Planning in the Asymmetric Adversarial Environment
Symmetry
HMCTS
online planning
MAXQ
asymmetric adversarial environment
author_facet Lina Lu
Wanpeng Zhang
Xueqiang Gu
Xiang Ji
Jing Chen
author_sort Lina Lu
title HMCTS-OP: Hierarchical MCTS Based Online Planning in the Asymmetric Adversarial Environment
title_short HMCTS-OP: Hierarchical MCTS Based Online Planning in the Asymmetric Adversarial Environment
title_full HMCTS-OP: Hierarchical MCTS Based Online Planning in the Asymmetric Adversarial Environment
title_fullStr HMCTS-OP: Hierarchical MCTS Based Online Planning in the Asymmetric Adversarial Environment
title_full_unstemmed HMCTS-OP: Hierarchical MCTS Based Online Planning in the Asymmetric Adversarial Environment
title_sort hmcts-op: hierarchical mcts based online planning in the asymmetric adversarial environment
publisher MDPI AG
series Symmetry
issn 2073-8994
publishDate 2020-05-01
description The Monte Carlo Tree Search (MCTS) has demonstrated excellent performance in solving many planning problems. However, the state space and the branching factors are huge, and the planning horizon is long in many practical applications, especially in the adversarial environment. It is computationally expensive to cover a sufficient number of rewarded states that are far away from the root in the flat non-hierarchical MCTS. Therefore, the flat non-hierarchical MCTS is inefficient for dealing with planning problems with a long planning horizon, huge state space, and branching factors. In this work, we propose a novel hierarchical MCTS-based online planning method named the HMCTS-OP to tackle this issue. The HMCTS-OP integrates the MAXQ-based task hierarchies and the hierarchical MCTS algorithms into the online planning framework. Specifically, the MAXQ-based task hierarchies reduce the search space and guide the search process. Therefore, the computational complexity is significantly reduced. Moreover, the reduction in the computational complexity enables the MCTS to perform a deeper search to find better action in a limited time. We evaluate the performance of the HMCTS-OP in the domain of online planning in the asymmetric adversarial environment. The experiment results show that the HMCTS-OP outperforms other online planning methods in this domain.
topic HMCTS
online planning
MAXQ
asymmetric adversarial environment
url https://www.mdpi.com/2073-8994/12/5/719
work_keys_str_mv AT linalu hmctsophierarchicalmctsbasedonlineplanningintheasymmetricadversarialenvironment
AT wanpengzhang hmctsophierarchicalmctsbasedonlineplanningintheasymmetricadversarialenvironment
AT xueqianggu hmctsophierarchicalmctsbasedonlineplanningintheasymmetricadversarialenvironment
AT xiangji hmctsophierarchicalmctsbasedonlineplanningintheasymmetricadversarialenvironment
AT jingchen hmctsophierarchicalmctsbasedonlineplanningintheasymmetricadversarialenvironment
_version_ 1724482694156386304