Q-learning with Continuous Action Value in Multi-agent Cooperation

Bibliographic Details
Main Authors: Yu-Hong Lin, 林咩
Other Authors: Kao-Shing Hwang
Format: Others
Language: en_US
Online Access: http://ndltd.ncl.edu.tw/handle/36513682200450671073
id ndltd-TW-094CCU05442040
record_format oai_dc
spelling ndltd-TW-094CCU054420402015-10-13T10:45:18Z http://ndltd.ncl.edu.tw/handle/36513682200450671073 Q-learning with Continuous Action Value in Multi-agent Cooperation 具連續行為的Q-learning應用於多重代理人之合作 Yu-Hong Lin 林咩 Master's === National Chung Cheng University === Graduate Institute of Electrical Engineering === 94 === In this thesis, we propose a Q-learning algorithm with a continuous action space and extend it to a multi-agent system. We apply the algorithm to a task in which two robots, connected by a straight bar, take actions independently; they must cooperate to reach the goal while avoiding obstacles in the environment. Conventional Q-learning requires a pre-defined, discrete state space, so it can represent only a finite number of states and actions. This is impractical because, in the real world, both the environment's states and the agents' actions are continuous, so discretized Q-learning cannot distinguish the variations among different situations that fall into the same state. We use the concept of the SRV (Stochastic Real-Valued) unit to train the action taken in each state, so the resulting actions are continuous. This brings the simulation closer to the real world, relaxes the need for a pre-defined action space in Q-learning, and yields better learning outcomes. Kao-Shing Hwang 黃國勝 Thesis ; thesis 46 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Chung Cheng University === Graduate Institute of Electrical Engineering === 94 === In this thesis, we propose a Q-learning algorithm with a continuous action space and extend it to a multi-agent system. We apply the algorithm to a task in which two robots, connected by a straight bar, take actions independently; they must cooperate to reach the goal while avoiding obstacles in the environment. Conventional Q-learning requires a pre-defined, discrete state space, so it can represent only a finite number of states and actions. This is impractical because, in the real world, both the environment's states and the agents' actions are continuous, so discretized Q-learning cannot distinguish the variations among different situations that fall into the same state. We use the concept of the SRV (Stochastic Real-Valued) unit to train the action taken in each state, so the resulting actions are continuous. This brings the simulation closer to the real world, relaxes the need for a pre-defined action space in Q-learning, and yields better learning outcomes.
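
The abstract above pairs tabular Q-learning with an SRV-style stochastic real-valued unit so that each discrete state can emit a continuous action. The sketch below is a hypothetical Python illustration of that combination, not the thesis's actual implementation: the class name SRVQLearner, the use of a simple state-value critic, and the specific update rules are assumptions made for exposition only.

import numpy as np

# Minimal sketch (assumptions, not the thesis's formulation): a tabular critic over
# discrete states plus an SRV-style unit that keeps a real-valued mean action per
# state and explores with Gaussian noise. The mean is nudged toward sampled actions
# that produced a positive TD error, so the learned action is continuous even
# though the state space stays discrete.
class SRVQLearner:
    def __init__(self, n_states, alpha=0.1, gamma=0.95, beta=0.05, sigma=0.3):
        self.v = np.zeros(n_states)            # critic: value of each discrete state
        self.mu = np.zeros(n_states)           # actor: mean continuous action per state
        self.alpha, self.gamma = alpha, gamma  # critic step size and discount factor
        self.beta, self.sigma = beta, sigma    # actor step size and exploration std.

    def act(self, s):
        # SRV-style exploration: sample a real-valued action around the state's mean.
        return self.mu[s] + np.random.normal(0.0, self.sigma)

    def update(self, s, a, r, s_next):
        # One-step TD error of the critic.
        td_error = r + self.gamma * self.v[s_next] - self.v[s]
        self.v[s] += self.alpha * td_error
        # SRV-style actor update: move the mean toward the sampled action in
        # proportion to how much better than expected the outcome was.
        self.mu[s] += self.beta * td_error * (a - self.mu[s])

# Hypothetical usage with a Gym-like environment exposing integer state indices
# and accepting a scalar continuous action ('env' is assumed, not part of this record):
#   agent = SRVQLearner(n_states=100)
#   a = agent.act(s)
#   s_next, r = env.step(a)
#   agent.update(s, a, r, s_next)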
author2 Kao-Shing Hwang
author_facet Kao-Shing Hwang
Yu-Hong Lin
林咩
author Yu-Hong Lin
林咩
spellingShingle Yu-Hong Lin
林咩
Q-learning with Continuous Action Value in Multi-agent Cooperation
author_sort Yu-Hong Lin
title Q-learning with Continuous Action Value in Multi-agent Cooperation
title_short Q-learning with Continuous Action Value in Multi-agent Cooperation
title_full Q-learning with Continuous Action Value in Multi-agent Cooperation
title_fullStr Q-learning with Continuous Action Value in Multi-agent Cooperation
title_full_unstemmed Q-learning with Continuous Action Value in Multi-agent Cooperation
title_sort q-learning with continuous action value in multi-agent cooperation
url http://ndltd.ncl.edu.tw/handle/36513682200450671073
work_keys_str_mv AT yuhonglin qlearningwithcontinuousactionvalueinmultiagentcooperation
AT línmiē qlearningwithcontinuousactionvalueinmultiagentcooperation
AT yuhonglin jùliánxùxíngwèideqlearningyīngyòngyúduōzhòngdàilǐrénzhīhézuò
AT línmiē jùliánxùxíngwèideqlearningyīngyòngyúduōzhòngdàilǐrénzhīhézuò
_version_ 1716833108120043520