A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems

Bibliographic Details
Main Authors: Yu-Jen Chen, 陳昱仁
Other Authors: Kao-Shing Hwang
Format: Others
Language: en_US
Published: 2009
Online Access: http://ndltd.ncl.edu.tw/handle/09344554392677792073
id ndltd-TW-097CCU05442116
record_format oai_dc
spelling ndltd-TW-097CCU05442116 2016-05-04T04:26:07Z http://ndltd.ncl.edu.tw/handle/09344554392677792073 A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems 基於自我組織決策樹多重代理人之策略分享機制 Yu-Jen Chen 陳昱仁 Ph.D. 國立中正大學 (National Chung Cheng University) 電機工程所 (Institute of Electrical Engineering) 97 In a cooperative social environment, a reinforcement learning agent not only learns to achieve its goal by trial and error, but can also improve its learning efficiency through information shared instantaneously among agents. The purpose of this thesis is to investigate how multiple agents share information, and what information should be shared, in a real environment. When applying reinforcement learning to a real environment, state partitioning is an important open problem, because it significantly affects learning performance. The fundamental approach of Q-learning is a table-lookup method over a finite, discrete state space: the learner incrementally estimates the Q-value of each state from rewards received from the environment and from previous Q-value estimates. Unfortunately, robots learn and act in a continuous perceptual space, where observed perceptions must be transformed into, or coarsely treated as, spatio-temporal states. There is still no elegant, unified way to combine discrete actions with continuous observations or states while remaining optimal in computation time, memory usage, and so on. Therefore, how to accommodate continuous states with a finite, discrete set of actions has become an important and intriguing issue in this research area. In this thesis, we propose an adaptive state-partitioning method that uses decision trees to discretize the state space adaptively and effectively. Instead of exhaustively searching for splits under a predefined impurity measure, the proposed method splits the state space according to the temporal-difference errors generated by reinforcement learning. Building on this approach, we also introduce an algorithm that extends the action policy from a discrete set to a real-valued domain; that is, the proposed method generates a real-valued action by selecting one action from a discrete set and perturbing it randomly but slightly with an associated bias. From the viewpoint of exploration and exploitation, the method searches for a better action around a paradigm action in the solution space, varying it within the biased region. Pursuing the applicability of the proposed methods to multi-agent systems, we define a policy-sharing mechanism through which agents share the policies of the local regions in which they have better experience. Kao-Shing Hwang 黃國勝 2009 學位論文 (degree thesis) ; thesis 71 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Ph.D. === 國立中正大學 (National Chung Cheng University) === 電機工程所 (Institute of Electrical Engineering) === 97 === In a cooperative social environment, a reinforcement learning agent not only learns to achieve its goal by trial and error, but can also improve its learning efficiency through information shared instantaneously among agents. The purpose of this thesis is to investigate how multiple agents share information, and what information should be shared, in a real environment. When applying reinforcement learning to a real environment, state partitioning is an important open problem, because it significantly affects learning performance. The fundamental approach of Q-learning is a table-lookup method over a finite, discrete state space: the learner incrementally estimates the Q-value of each state from rewards received from the environment and from previous Q-value estimates. Unfortunately, robots learn and act in a continuous perceptual space, where observed perceptions must be transformed into, or coarsely treated as, spatio-temporal states. There is still no elegant, unified way to combine discrete actions with continuous observations or states while remaining optimal in computation time, memory usage, and so on. Therefore, how to accommodate continuous states with a finite, discrete set of actions has become an important and intriguing issue in this research area. In this thesis, we propose an adaptive state-partitioning method that uses decision trees to discretize the state space adaptively and effectively. Instead of exhaustively searching for splits under a predefined impurity measure, the proposed method splits the state space according to the temporal-difference errors generated by reinforcement learning. Building on this approach, we also introduce an algorithm that extends the action policy from a discrete set to a real-valued domain; that is, the proposed method generates a real-valued action by selecting one action from a discrete set and perturbing it randomly but slightly with an associated bias. From the viewpoint of exploration and exploitation, the method searches for a better action around a paradigm action in the solution space, varying it within the biased region. Pursuing the applicability of the proposed methods to multi-agent systems, we define a policy-sharing mechanism through which agents share the policies of the local regions in which they have better experience.
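
The abstract compresses several mechanisms into prose: Q-learning over tree leaves (Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))), leaf splitting driven by TD error rather than an impurity measure, real-valued actions obtained by perturbing a discrete paradigm action, and local policy sharing among agents. The Python sketch below is a minimal illustration of how such pieces could fit together, not the thesis's actual algorithm: every identifier (Leaf, maybe_split, share_policy), the widest-dimension midpoint split, and all thresholds are hypothetical assumptions.

    # Hypothetical sketch of the ideas in the abstract -- NOT the thesis's
    # algorithm. Names, thresholds, and heuristics are illustrative only.
    import random

    class Leaf:
        """One hyper-rectangular region of the state space, with
        per-discrete-action Q-values and the TD errors observed here."""
        def __init__(self, low, high, n_actions):
            self.low, self.high = list(low), list(high)  # region bounds
            self.q = [0.0] * n_actions                   # Q-value per action
            self.td_errors = []                          # TD errors seen here
            self.split_dim = None                        # set once split
            self.split_val = None
            self.left = None
            self.right = None

    def find_leaf(node, state):
        """Descend from `node` to the leaf whose region contains `state`."""
        while node.split_dim is not None:
            node = node.left if state[node.split_dim] < node.split_val else node.right
        return node

    def maybe_split(leaf, td_threshold=0.5, min_visits=30):
        """Split a leaf whose mean |TD error| stays large. The assumption
        (echoing the abstract) is that persistent TD error marks a region
        lumping together states with different values, so the region is
        refined instead of searching splits with an impurity measure."""
        if len(leaf.td_errors) < min_visits:
            return
        if sum(abs(e) for e in leaf.td_errors) / len(leaf.td_errors) <= td_threshold:
            return
        widths = [h - l for l, h in zip(leaf.low, leaf.high)]
        d = widths.index(max(widths))            # split widest dimension
        mid = (leaf.low[d] + leaf.high[d]) / 2.0 # at its midpoint
        n = len(leaf.q)
        left, right = Leaf(leaf.low, leaf.high, n), Leaf(leaf.low, leaf.high, n)
        left.high[d] = mid
        right.low[d] = mid
        left.q, right.q = list(leaf.q), list(leaf.q)  # children inherit Q-values
        leaf.split_dim, leaf.split_val = d, mid
        leaf.left, leaf.right = left, right
        leaf.td_errors = []

    def q_update(root, s, a, r, s_next, alpha=0.1, gamma=0.9):
        """One-step Q-learning update on the leaf covering `s`."""
        leaf, nxt = find_leaf(root, s), find_leaf(root, s_next)
        td = r + gamma * max(nxt.q) - leaf.q[a]  # temporal-difference error
        leaf.q[a] += alpha * td
        leaf.td_errors.append(td)
        maybe_split(leaf)
        return td

    def continuous_action(leaf, prototypes, bias=0.1):
        """Pick the greedy discrete action, then return its real-valued
        prototype perturbed slightly within a biased region -- exploration
        around a paradigm action, as the abstract describes."""
        a = max(range(len(leaf.q)), key=lambda i: leaf.q[i])
        return a, prototypes[a] + random.uniform(-bias, bias)

    def share_policy(leaf_mine, leaf_peer):
        """Crude stand-in for local policy sharing: adopt a peer's
        Q-values for a region where the peer's best estimate dominates."""
        if max(leaf_peer.q) > max(leaf_mine.q):
            leaf_mine.q = list(leaf_peer.q)

Starting from root = Leaf([0.0, 0.0], [1.0, 1.0], n_actions=4), repeated q_update calls refine the tree only where TD error stays large, so resolution concentrates where the value function actually varies; continuous_action then turns the winning discrete action into a real-valued command, and share_policy gives a rough flavor of adopting a peer's better local estimates.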
author2 Kao-Shing Hwang
author_facet Kao-Shing Hwang
Yu-Jen Chen
陳昱仁
author Yu-Jen Chen
陳昱仁
spellingShingle Yu-Jen Chen
陳昱仁
A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
author_sort Yu-Jen Chen
title A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
title_short A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
title_full A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
title_fullStr A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
title_full_unstemmed A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
title_sort self-organizing decision tree approach to policy sharing of multi-agent systems
publishDate 2009
url http://ndltd.ncl.edu.tw/handle/09344554392677792073
work_keys_str_mv AT yujenchen aselforganizingdecisiontreeapproachtopolicysharingofmultiagentsystems
AT chényùrén aselforganizingdecisiontreeapproachtopolicysharingofmultiagentsystems
AT yujenchen jīyúzìwǒzǔzhījuécèshùduōzhòngdàilǐrénzhīcèlüèfēnxiǎngjīzhì
AT chényùrén jīyúzìwǒzǔzhījuécèshùduōzhòngdàilǐrénzhīcèlüèfēnxiǎngjīzhì
AT yujenchen selforganizingdecisiontreeapproachtopolicysharingofmultiagentsystems
AT chényùrén selforganizingdecisiontreeapproachtopolicysharingofmultiagentsystems
_version_ 1718258134176235520