A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
Doctorate === National Chung Cheng University === Institute of Electrical Engineering === 97 === In a cooperative social environment, a reinforcement-learning agent not only learns to achieve its goal by trial and error but also improves its learning efficiency through information shared instantaneously with other agents. The purpose of this thesis is to investigate how th...
Main Authors: | Yu-Jen Chen, 陳昱仁 |
---|---|
Other Authors: | Kao-Shing Hwang, 黃國勝 |
Format: | Others |
Language: | en_US |
Published: | 2009 |
Online Access: | http://ndltd.ncl.edu.tw/handle/09344554392677792073 |
id: ndltd-TW-097CCU05442116
record_format: oai_dc
spelling:
ndltd-TW-097CCU05442116 2016-05-04T04:26:07Z http://ndltd.ncl.edu.tw/handle/09344554392677792073 A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems 基於自我組織決策樹多重代理人之策略分享機制 Yu-Jen Chen 陳昱仁 Doctorate, National Chung Cheng University, Institute of Electrical Engineering, 97 (abstract as in the description field below) Kao-Shing Hwang 黃國勝 2009 Degree thesis; 71; en_US
collection: NDLTD
language: en_US
format: Others
sources: NDLTD
description:
Doctorate === National Chung Cheng University === Institute of Electrical Engineering === 97 === In a cooperative social environment, a reinforcement-learning agent not only learns to achieve its goal by trial and error but also improves its learning efficiency through information shared instantaneously with other agents. The purpose of this thesis is to investigate how multiple agents share information, and what information should be shared, in a real environment. When applying reinforcement learning to a real environment, state partitioning is an important open problem because it significantly affects learning performance. Fundamentally, Q-learning is a table-lookup method defined over a finite, discrete state space: the learner incrementally estimates the Q-value of each state from the rewards received from the environment and the previous Q-value estimates. Unfortunately, robots learn and behave in a continuous perceptual space, where observations must be transformed into, or coarsely treated as, spatio-temporal states. There is still no elegant, unified way to combine discrete actions with continuous observations or states that is optimal in computation time, memory, and so on. Therefore, how to accommodate continuous states with a finite, discrete set of actions has become an important and intriguing issue in this research area.
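The tabular, incremental update described above can be made concrete. The following is a minimal sketch of standard tabular Q-learning, not code from the thesis; the learning rate `alpha`, the discount `gamma`, and the greedy bootstrapped target are the usual textbook assumptions:

```python
# Minimal tabular Q-learning sketch (illustrative; not the thesis code).
# Assumes the finite, discrete state space the abstract describes as the
# table-lookup setting; alpha and gamma are hypothetical hyperparameters.
from collections import defaultdict

class TabularQ:
    def __init__(self, actions, alpha=0.1, gamma=0.95):
        self.q = defaultdict(float)   # table lookup: (state, action) -> Q-value
        self.actions = actions
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor

    def update(self, s, a, r, s_next):
        """One incremental Q-value estimate from the latest reward and the
        previous estimates, as the abstract describes."""
        best_next = max(self.q[(s_next, b)] for b in self.actions)
        td_error = r + self.gamma * best_next - self.q[(s, a)]
        self.q[(s, a)] += self.alpha * td_error
        return td_error               # the signal the tree-splitting below reuses
```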
In this thesis, we propose an adaptive state-partitioning method that discretizes the state space effectively by means of decision trees. Instead of exhaustively searching for splits under a predefined impurity measure, the proposed method splits the state space according to the temporal-difference errors generated during reinforcement learning. Building on this approach, we also introduce an algorithm that extends the action policy from a discrete space to a real-valued domain; that is, the method generates a real-valued action by selecting one action from the discrete set and perturbing it randomly, but slightly, with an associated bias. From the viewpoint of exploration and exploitation, the method searches for a better action around a paradigm action in the solution space, varying it within the biased region. Pursuing the applicability of the proposed methods to multi-agent systems, we define a policy-sharing mechanism that lets agents share the policies of the local regions in which each agent has accumulated better experience.
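A rough sketch can illustrate how the two ideas in this paragraph might fit together. Everything below is an assumption made for illustration rather than the thesis's actual design: the split test (variance of recent TD errors against a threshold `var_threshold`), the midpoint split of the widest dimension, and the uniform bias around the paradigm action are all hypothetical stand-ins:

```python
import random

class PartitionNode:
    """One axis-aligned region (leaf) of the state-space partition. A leaf
    splits itself when the TD errors observed inside it stay large and
    varied, suggesting it lumps together states that deserve different
    values -- a TD-driven criterion rather than an impurity measure."""

    def __init__(self, low, high, min_samples=50, var_threshold=1.0):
        self.low, self.high = list(low), list(high)  # region bounds per dimension
        self.td_errors = []
        self.children = None          # (split_dim, split_value, left, right)
        self.min_samples = min_samples
        self.var_threshold = var_threshold

    def leaf_for(self, x):
        if self.children is None:
            return self
        dim, val, left, right = self.children
        return (left if x[dim] < val else right).leaf_for(x)

    def observe(self, x, td_error):
        """Record the TD error produced at state x; split if warranted."""
        leaf = self.leaf_for(x)
        leaf.td_errors.append(td_error)
        if len(leaf.td_errors) >= leaf.min_samples:
            mean = sum(leaf.td_errors) / len(leaf.td_errors)
            var = sum((e - mean) ** 2 for e in leaf.td_errors) / len(leaf.td_errors)
            if var > leaf.var_threshold:
                leaf.split()
            leaf.td_errors.clear()

    def split(self):
        # Split the widest dimension at its midpoint (an illustrative choice).
        dim = max(range(len(self.low)), key=lambda d: self.high[d] - self.low[d])
        mid = (self.low[dim] + self.high[dim]) / 2.0
        left = PartitionNode(self.low, self.high[:dim] + [mid] + self.high[dim + 1:],
                             self.min_samples, self.var_threshold)
        right = PartitionNode(self.low[:dim] + [mid] + self.low[dim + 1:], self.high,
                              self.min_samples, self.var_threshold)
        self.children = (dim, mid, left, right)

def real_valued_action(paradigm, bias=0.1):
    """Exploit the chosen discrete 'paradigm' action while exploring a
    real-valued neighbourhood around it with a small random bias."""
    return paradigm + random.uniform(-bias, bias)
```

A policy-sharing step in this setting would let an agent adopt, for a given leaf region, the action preferences of whichever peer reports the better experience there; the actual exchange protocol is the thesis's contribution and is not reproduced in this sketch.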
author2: Kao-Shing Hwang
author_facet: Kao-Shing Hwang; Yu-Jen Chen 陳昱仁
author: Yu-Jen Chen 陳昱仁
spellingShingle: Yu-Jen Chen 陳昱仁; A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
author_sort: Yu-Jen Chen
title: A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
title_short: A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
title_full: A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
title_fullStr: A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
title_full_unstemmed: A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
title_sort: self-organizing decision tree approach to policy sharing of multi-agent systems
publishDate: 2009
url: http://ndltd.ncl.edu.tw/handle/09344554392677792073
work_keys_str_mv: AT yujenchen aselforganizingdecisiontreeapproachtopolicysharingofmultiagentsystems; AT chényùrén aselforganizingdecisiontreeapproachtopolicysharingofmultiagentsystems; AT yujenchen jīyúzìwǒzǔzhījuécèshùduōzhòngdàilǐrénzhīcèlüèfēnxiǎngjīzhì; AT chényùrén jīyúzìwǒzǔzhījuécèshùduōzhòngdàilǐrénzhīcèlüèfēnxiǎngjīzhì; AT yujenchen selforganizingdecisiontreeapproachtopolicysharingofmultiagentsystems; AT chényùrén selforganizingdecisiontreeapproachtopolicysharingofmultiagentsystems
_version_: 1718258134176235520