Action Segmentation and Learning by Inverse Reinforcement Learning
Master's thesis (碩士) === National Sun Yat-sen University (國立中山大學) === Department of Electrical Engineering (電機工程學系研究所) === Academic year 104 (2015)
Main Author: Hsuan-yi Chiang (江炫儀)
Advisor: Kao-Shing Hwang (黃國勝)
Format: Thesis (學位論文), 70 pages
Language: en_US
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/24130256006959006664
Record ID: ndltd-TW-104NSYS5442021
Title (Chinese): 透過反加強式學習模仿作行為分段及學習
Abstract:
Reinforcement learning allows agents to learn behaviors through trial and error. However, as the difficulty of a mission increases, its reward function becomes harder to define. By combining the concepts of the AdaBoost classifier and the Upper Confidence Bound (UCB) algorithm, a method based on inverse reinforcement learning is proposed to construct the reward function of a complex mission. Inverse reinforcement learning allows the agent to rebuild a reward function by imitating the expert's interactions with the environment. During the imitation, the agent continuously compares the difference between the expert's behavior and its own, and the proposed method then determines a specific weight for each state via AdaBoost. The weight is combined with the state confidence from UCB to construct an approximate reward function. This thesis uses a state-encoding method and action segmentation to simplify the problem, then applies the proposed method to determine the optimal reward function. Finally, simulations in a maze environment and a soccer-robot environment are used to validate the proposed method and to demonstrate the reduction in computational time.
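The abstract describes the method only at a high level, so the following is a minimal Python sketch of the general idea it outlines: maintain a per-state weight that is updated AdaBoost-style wherever the learner disagrees with the expert, and scale it by a UCB-style confidence term to form an approximate reward. The class name, update rules, and constants below are illustrative assumptions, not the thesis's actual formulation.

```python
import math
import random

class ApproximateReward:
    """Hedged sketch of an AdaBoost-weighted, UCB-scaled reward estimate.

    This is an illustration of the concept in the abstract, not the
    algorithm from the thesis. The exponential reweighting and the
    confidence formula are assumed for demonstration purposes.
    """

    def __init__(self, num_states, beta=0.5, c=1.0):
        self.beta = beta                    # AdaBoost-style step size (assumed)
        self.c = c                          # UCB exploration constant (assumed)
        self.weights = [1.0] * num_states   # per-state weights
        self.visits = [0] * num_states      # visit counts for the confidence term
        self.total_visits = 0

    def update(self, state, agent_action, expert_action):
        """Reweight a state depending on whether the agent matched the expert."""
        self.visits[state] += 1
        self.total_visits += 1
        if agent_action == expert_action:
            # down-weight states the agent already imitates correctly
            self.weights[state] *= math.exp(-self.beta)
        else:
            # up-weight states where the agent still disagrees with the expert
            self.weights[state] *= math.exp(self.beta)

    def confidence(self, state):
        """UCB-style state confidence (assumed form; +1 avoids division by zero)."""
        return self.c * math.sqrt(
            math.log(self.total_visits + 1) / (self.visits[state] + 1)
        )

    def reward(self, state):
        """Approximate reward: normalized state weight scaled by confidence."""
        total = sum(self.weights)
        return (self.weights[state] / total) * self.confidence(state)


# Hypothetical usage on a tiny maze with a few expert demonstrations.
# The expert state-action pairs and the random stand-in learner are
# placeholders, not data from the thesis.
approx = ApproximateReward(num_states=16)
for state, expert_action in [(0, "N"), (1, "E"), (1, "E"), (2, "S")]:
    agent_action = random.choice(["N", "E", "S", "W"])
    approx.update(state, agent_action, expert_action)
print([round(approx.reward(s), 3) for s in range(4)])
```

In this sketch, states where the agent repeatedly deviates from the expert accumulate larger weights, while the UCB term keeps rarely visited states from being trusted too quickly; the product serves as a stand-in for the approximate reward function the abstract describes.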