An Improved Deep Reinforcement Learning with Sparse Rewards
基於稀疏報酬改良深度加強式學習
Master's thesis (碩士), National Sun Yat-sen University (國立中山大學), Department of Electrical Engineering (電機工程學系研究所), academic year 107 (2018)
Main Author: Lu-cheng Chi (紀律呈)
Other Author: Kao-Shing Hwang (黃國勝)
Format: Others (學位論文 / thesis, 43 pages)
Language: zh-TW
Published: 2018
Online Access: http://ndltd.ncl.edu.tw/handle/eq94pr
id: ndltd-TW-107NSYS5442010
record_format: oai_dc
collection: NDLTD
sources: NDLTD
Description:
In reinforcement learning, how an agent should explore in an environment with sparse rewards is a long-standing problem. The improved deep reinforcement learning method described in this thesis encourages the agent to explore unvisited environmental states in such an environment.
In deep reinforcement learning, an agent typically feeds an image observation from the environment directly into its neural network. However, other, often neglected observations from the environment, such as depth, might provide valuable information.
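As a rough illustration of how such a neglected observation can be exploited, the sketch below (PyTorch assumed; the layer sizes, input shape, and all names are hypothetical, not taken from the thesis) encodes the image observation with a convolutional trunk and predicts the depth observation from the shared features; the next paragraph builds the thesis's method around exactly this kind of image-to-observation predictor:

```python
import torch
import torch.nn as nn

class HeteroEncoder(nn.Module):
    """CNN used as a hetero-encoder: image observation in, depth map out."""
    def __init__(self):
        super().__init__()
        # Convolutional trunk over an 84x84 single-channel image observation.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.fc = nn.Sequential(nn.Linear(32 * 9 * 9, 256), nn.ReLU())
        # Auxiliary head: predicts the (flattened) depth observation, which
        # serves as the supervised target described in the abstract.
        self.depth_head = nn.Linear(256, 84 * 84)

    def forward(self, image):                 # image: (B, 1, 84, 84)
        h = self.fc(self.conv(image))         # shared features: (B, 256)
        return h, self.depth_head(h)          # features + depth prediction

enc = HeteroEncoder()
feats, depth_pred = enc(torch.zeros(1, 1, 84, 84))  # -> (1, 256), (1, 7056)
```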
The method described in this thesis is based on the Actor-Critic algorithm and uses a convolutional neural network as a hetero-encoder between the image input and other observations from the environment, as sketched above. In an environment with sparse rewards, these neglected observations serve as target outputs for supervised learning, which provides the agent with denser training signals that bootstrap reinforcement learning. In addition, the loss from this supervised learning is fed back to the agent as a reward for its exploration behavior, called the label reward, to encourage the agent to explore unvisited environmental states. Finally, multiple neural networks are constructed with the Asynchronous Advantage Actor-Critic (A3C) algorithm, and the policy is learned with multiple agents.
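To make the label-reward mechanism concrete, here is a hedged sketch (reusing the HeteroEncoder above; the beta weight, the one-step TD form, and all names are illustrative assumptions, not the thesis's actual implementation) in which the supervised loss serves both as a training signal and as an intrinsic reward added to the environment reward; under A3C, several worker threads would run this update asynchronously against a shared model:

```python
import torch.nn as nn
import torch.nn.functional as F

class ActorCriticWithAux(nn.Module):
    """Policy and value heads on top of the HeteroEncoder sketched above."""
    def __init__(self, n_actions=4):
        super().__init__()
        self.encoder = HeteroEncoder()
        self.policy = nn.Linear(256, n_actions)
        self.value = nn.Linear(256, 1)

    def forward(self, image):
        h, depth_pred = self.encoder(image)
        return self.policy(h), self.value(h).squeeze(-1), depth_pred

def one_step_losses(net, image, depth_target, action, env_reward, next_value,
                    gamma=0.99, beta=0.01):
    # next_value: bootstrap value of the successor state (assumed detached).
    logits, value, depth_pred = net(image)
    # Supervised signal: predict the neglected depth observation.
    sup_loss = F.mse_loss(depth_pred, depth_target)
    # "Label reward": the supervised loss, detached and scaled, fed back to
    # the agent as an intrinsic reward.
    label_reward = beta * sup_loss.detach()
    # One-step actor-critic update on the densified reward.
    td_target = env_reward + label_reward + gamma * next_value
    advantage = td_target - value
    log_prob = F.log_softmax(logits, dim=-1).gather(1, action.unsqueeze(1)).squeeze(1)
    policy_loss = -(log_prob * advantage.detach()).mean()
    value_loss = advantage.pow(2).mean()
    return policy_loss + 0.5 * value_loss + sup_loss
```

Because the prediction loss is highest in states the encoder has rarely seen, the label reward is largest there, which is what nudges the agent toward unvisited states.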
The method described in this thesis is compared with other deep reinforcement learning methods in an environment with sparse rewards and achieves better performance.