Action Segmentation and Learning by Inverse Reinforcement Learning


Bibliographic Details
Main Author: Hsuan-yi Chiang (江炫儀)
Other Authors: Kao-Shing Hwang
Format: Others
Language: en_US
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/24130256006959006664
Description
Summary: Master's thesis === National Sun Yat-sen University === Department of Electrical Engineering === 104 (2015) === Reinforcement learning allows agents to learn behaviors through trial and error. However, as the difficulty of a mission increases, its reward function also becomes harder to define. By combining the concepts of the AdaBoost classifier and Upper Confidence Bounds (UCB), a method based on inverse reinforcement learning is proposed to construct the reward function of a complex mission. Inverse reinforcement learning allows the agent to rebuild a reward function by imitating the expert's interactions with the environment. During imitation, the agent continuously compares the difference between the expert and itself, and the proposed method then assigns a weight to each state via AdaBoost. This weight is combined with the state confidence from UCB to construct an approximate reward function. The thesis uses a state encoding method and action segmentation to simplify the problem, then applies the proposed method to determine the optimal reward function. Finally, simulations of a maze environment and a soccer robot environment validate the proposed method and show a reduction in computation time.
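As a rough illustration of the mechanism the abstract describes, the Python sketch below combines AdaBoost-style per-state weights (raised where the agent's actions disagree with the expert's demonstrations) with a UCB state-confidence bonus to form an approximate per-state reward. The state-space size, function names, constants, and the multiplicative combination of the two terms are illustrative assumptions, not the thesis's actual implementation.

import numpy as np

N_STATES = 16  # assumed size of the encoded, discretized state space

def adaboost_state_weights(expert_actions, agent_actions):
    # AdaBoost-style reweighting: states where the agent's action
    # disagrees with the expert's get exponentially larger weight.
    w = np.full(N_STATES, 1.0 / N_STATES)
    mismatch = expert_actions != agent_actions
    err = np.clip(w[mismatch].sum(), 1e-6, 1.0 - 1e-6)  # weighted error rate
    alpha = 0.5 * np.log((1.0 - err) / err)             # AdaBoost stage weight
    w *= np.exp(alpha * np.where(mismatch, 1.0, -1.0))
    return w / w.sum()

def ucb_confidence(visit_counts):
    # UCB-style confidence bonus: rarely visited states score higher.
    total = visit_counts.sum()
    return np.sqrt(2.0 * np.log(total + 1.0) / (visit_counts + 1.0))

def approximate_reward(expert_actions, agent_actions, visit_counts):
    # Combine the two per-state terms into an approximate reward
    # (the multiplicative combination is an assumption of this sketch).
    return adaboost_state_weights(expert_actions, agent_actions) \
        * ucb_confidence(visit_counts)

# Toy usage: random expert/agent action choices over a 16-state grid.
rng = np.random.default_rng(0)
expert = rng.integers(0, 4, size=N_STATES)   # expert's action per state
agent = rng.integers(0, 4, size=N_STATES)    # agent's current action per state
visits = rng.integers(1, 50, size=N_STATES)  # agent's state-visit counts
print(approximate_reward(expert, agent, visits))

In this reading, the AdaBoost term concentrates reward on states where imitation still fails, while the UCB term keeps under-explored states attractive, which matches the abstract's account of iteratively comparing the agent against the expert.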