A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
Doctorate === National Chung Cheng University === Institute of Electrical Engineering === 97 === In a cooperative social environment, a reinforcement-learning agent not only learns to achieve its goal by trial and error but also improves its learning efficiency through information shared instantaneously with other agents. The purpose of this thesis is to investigate how th...
Main Authors: | Yu-Jen Chen, 陳昱仁 |
---|---|
Other Authors: | Kao-Shing Hwang, 黃國勝 |
Format: | Others |
Language: | en_US |
Published: | 2009 |
Online Access: | http://ndltd.ncl.edu.tw/handle/09344554392677792073 |
id: ndltd-TW-097CCU05442116
record_format: oai_dc
spelling:
ndltd-TW-097CCU05442116 2016-05-04T04:26:07Z http://ndltd.ncl.edu.tw/handle/09344554392677792073 A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems 基於自我組織決策樹多重代理人之策略分享機制 Yu-Jen Chen 陳昱仁 Doctorate, National Chung Cheng University, Institute of Electrical Engineering, 97 (abstract as in the description field below) Kao-Shing Hwang 黃國勝 2009 Degree thesis; 71; en_US
collection: NDLTD
language: en_US
format: Others
sources: NDLTD
description:
Doctorate === National Chung Cheng University === Institute of Electrical Engineering === 97 === In a cooperative social environment, a reinforcement-learning agent not only learns to achieve its goal by trial and error but also improves its learning efficiency through information shared instantaneously with other agents. The purpose of this thesis is to investigate how multiple agents share information, and what information should be shared, in a real environment. When applying reinforcement learning to a real environment, state partitioning is an important open problem because it significantly affects learning performance. Fundamentally, Q-learning is a table-lookup method defined over a finite, discrete state space: the learner incrementally estimates the Q-value of each state from the rewards received from the environment and the previous Q-value estimates. Unfortunately, robots learn and behave in a continuous perceptual space, where observations must be transformed into, or coarsely treated as, spatio-temporal states. There is still no elegant, unified way to combine discrete actions with continuous observations or states that is optimal in computation time, memory, and so on. Therefore, how to accommodate continuous states with a finite, discrete set of actions has become an important and intriguing issue in this research area.
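The tabular, incremental update described above can be made concrete. The following is a minimal sketch of standard tabular Q-learning, not code from the thesis; the learning rate `alpha`, the discount `gamma`, and the greedy bootstrapped target are the usual textbook assumptions:

```python
# Minimal tabular Q-learning sketch (illustrative; not the thesis code).
# Assumes the finite, discrete state space the abstract describes as the
# table-lookup setting; alpha and gamma are hypothetical hyperparameters.
from collections import defaultdict

class TabularQ:
    def __init__(self, actions, alpha=0.1, gamma=0.95):
        self.q = defaultdict(float)   # table lookup: (state, action) -> Q-value
        self.actions = actions
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor

    def update(self, s, a, r, s_next):
        """One incremental Q-value estimate from the latest reward and the
        previous estimates, as the abstract describes."""
        best_next = max(self.q[(s_next, b)] for b in self.actions)
        td_error = r + self.gamma * best_next - self.q[(s, a)]
        self.q[(s, a)] += self.alpha * td_error
        return td_error               # the signal the tree-splitting below reuses
```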
In this thesis, we propose an adaptive state-partitioning method that discretizes the state space effectively by means of decision trees. Instead of exhaustively searching for splits under a predefined impurity measure, the proposed method splits the state space according to the temporal-difference errors generated during reinforcement learning. Building on this approach, we also introduce an algorithm that extends the action policy from a discrete space to a real-valued domain; that is, the method generates a real-valued action by selecting one action from the discrete set and perturbing it randomly, but slightly, with an associated bias. From the viewpoint of exploration and exploitation, the method searches for a better action around a paradigm action in the solution space, varying it within the biased region. Pursuing the applicability of the proposed methods to multi-agent systems, we define a policy-sharing mechanism that lets agents share the policies of the local regions in which each agent has accumulated better experience.
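A rough sketch can illustrate how the two ideas in this paragraph might fit together. Everything below is an assumption made for illustration rather than the thesis's actual design: the split test (variance of recent TD errors against a threshold `var_threshold`), the midpoint split of the widest dimension, and the uniform bias around the paradigm action are all hypothetical stand-ins:

```python
import random

class PartitionNode:
    """One axis-aligned region (leaf) of the state-space partition. A leaf
    splits itself when the TD errors observed inside it stay large and
    varied, suggesting it lumps together states that deserve different
    values -- a TD-driven criterion rather than an impurity measure."""

    def __init__(self, low, high, min_samples=50, var_threshold=1.0):
        self.low, self.high = list(low), list(high)  # region bounds per dimension
        self.td_errors = []
        self.children = None          # (split_dim, split_value, left, right)
        self.min_samples = min_samples
        self.var_threshold = var_threshold

    def leaf_for(self, x):
        if self.children is None:
            return self
        dim, val, left, right = self.children
        return (left if x[dim] < val else right).leaf_for(x)

    def observe(self, x, td_error):
        """Record the TD error produced at state x; split if warranted."""
        leaf = self.leaf_for(x)
        leaf.td_errors.append(td_error)
        if len(leaf.td_errors) >= leaf.min_samples:
            mean = sum(leaf.td_errors) / len(leaf.td_errors)
            var = sum((e - mean) ** 2 for e in leaf.td_errors) / len(leaf.td_errors)
            if var > leaf.var_threshold:
                leaf.split()
            leaf.td_errors.clear()

    def split(self):
        # Split the widest dimension at its midpoint (an illustrative choice).
        dim = max(range(len(self.low)), key=lambda d: self.high[d] - self.low[d])
        mid = (self.low[dim] + self.high[dim]) / 2.0
        left = PartitionNode(self.low, self.high[:dim] + [mid] + self.high[dim + 1:],
                             self.min_samples, self.var_threshold)
        right = PartitionNode(self.low[:dim] + [mid] + self.low[dim + 1:], self.high,
                              self.min_samples, self.var_threshold)
        self.children = (dim, mid, left, right)

def real_valued_action(paradigm, bias=0.1):
    """Exploit the chosen discrete 'paradigm' action while exploring a
    real-valued neighbourhood around it with a small random bias."""
    return paradigm + random.uniform(-bias, bias)
```

A policy-sharing step in this setting would let an agent adopt, for a given leaf region, the action preferences of whichever peer reports the better experience there; the actual exchange protocol is the thesis's contribution and is not reproduced in this sketch.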
author2: Kao-Shing Hwang
author_facet: Kao-Shing Hwang; Yu-Jen Chen 陳昱仁
author: Yu-Jen Chen 陳昱仁
spellingShingle: Yu-Jen Chen 陳昱仁; A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
author_sort: Yu-Jen Chen
title: A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
title_short: A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
title_full: A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
title_fullStr: A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
title_full_unstemmed: A Self-Organizing Decision Tree Approach to Policy Sharing of Multi-agent Systems
title_sort: self-organizing decision tree approach to policy sharing of multi-agent systems
publishDate: 2009
url: http://ndltd.ncl.edu.tw/handle/09344554392677792073
work_keys_str_mv: AT yujenchen aselforganizingdecisiontreeapproachtopolicysharingofmultiagentsystems; AT chényùrén aselforganizingdecisiontreeapproachtopolicysharingofmultiagentsystems; AT yujenchen jīyúzìwǒzǔzhījuécèshùduōzhòngdàilǐrénzhīcèlüèfēnxiǎngjīzhì; AT chényùrén jīyúzìwǒzǔzhījuécèshùduōzhòngdàilǐrénzhīcèlüèfēnxiǎngjīzhì; AT yujenchen selforganizingdecisiontreeapproachtopolicysharingofmultiagentsystems; AT chényùrén selforganizingdecisiontreeapproachtopolicysharingofmultiagentsystems
_version_: 1718258134176235520