Safe Reinforcement Learning based Sequential Perturbation Learning Algorithm


Bibliographic Details
Main Authors: Ho, Chang-An, 何長安
Other Authors: Lin, Sheng-Fuu
Format: Others
Language: zh-TW
Published: 2009
Online Access:http://ndltd.ncl.edu.tw/handle/63234750154932788712
id ndltd-TW-097NCTU5591131
record_format oai_dc
spelling ndltd-TW-097NCTU55911312015-10-13T15:42:34Z http://ndltd.ncl.edu.tw/handle/63234750154932788712 Safe Reinforcement Learning based Sequential Perturbation Learning Algorithm 基於安全性增強式學習之循序擾動學習演算法 Ho, Chang-An 何長安 Master's National Chiao Tung University Department of Electrical and Control Engineering 97 This thesis presents a sequential perturbation learning architecture based on safe reinforcement learning (SRL-SP), which uses the concept of line search to apply perturbations to each weight of a neural network. After the perturbations are applied, the value function of the pre-perturbation and post-perturbation networks is evaluated in order to update the weights. Applying perturbations keeps the solution from falling into local optima and from oscillating in the solution space, both of which reduce learning efficiency. In addition, within the reinforcement learning structure, Lyapunov design methods are used to set the learning objective and a predefined set of goal states. This greatly reduces the learning time; in other words, it rapidly guides the plant's state into the goal state. In the simulations, an n-mass inverted pendulum model serves as a humanoid robot model, demonstrating that the proposed method learns more effectively. Lin, Sheng-Fuu 林昇甫 2009 degree thesis ; thesis 89 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Chiao Tung University === Department of Electrical and Control Engineering === 97 === This thesis presents a sequential perturbation learning architecture based on safe reinforcement learning (SRL-SP), which uses the concept of line search to apply perturbations to each weight of a neural network. After the perturbations are applied, the value function of the pre-perturbation and post-perturbation networks is evaluated in order to update the weights. Applying perturbations keeps the solution from falling into local optima and from oscillating in the solution space, both of which reduce learning efficiency. In addition, within the reinforcement learning structure, Lyapunov design methods are used to set the learning objective and a predefined set of goal states. This greatly reduces the learning time; in other words, it rapidly guides the plant's state into the goal state. In the simulations, an n-mass inverted pendulum model serves as a humanoid robot model, demonstrating that the proposed method learns more effectively.
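As an illustration of the weight-update scheme sketched in the abstract, the Python snippet below shows one plausible reading of a sequential perturbation step: each weight is perturbed in turn, the value function is evaluated before and after the perturbation, and the perturbation is kept only if it improves the value. The value function V, the step size delta, and the accept-if-better rule are assumptions made for this example, not details taken from the thesis itself.

    import numpy as np

    # Illustrative sketch only (not the thesis's actual implementation):
    # a line-search-style sequential perturbation update in which each
    # network weight is perturbed in turn and a perturbation is kept
    # only if it improves an estimated value function V.
    def sequential_perturbation_step(weights, V, delta=0.01):
        w = np.asarray(weights, dtype=float).copy()
        best_value = V(w)
        for i in range(w.size):
            for step in (delta, -delta):      # probe both directions, as in a line search
                trial = w.copy()
                trial.flat[i] += step
                trial_value = V(trial)        # compare pre- vs post-perturbation value
                if trial_value > best_value:  # keep only improving perturbations
                    w, best_value = trial, trial_value
                    break
        return w, best_value

    # Toy usage: maximize V(w) = -||w||^2, whose optimum is w = 0.
    if __name__ == "__main__":
        w = np.array([0.5, -0.3, 0.8])
        for _ in range(200):
            w, v = sequential_perturbation_step(w, lambda x: -np.sum(x**2))
        print(w, v)

In the toy run, repeated sequential perturbation steps drive the weights toward the maximizer to within the step size, which mirrors the abstract's claim that accepting only value-improving perturbations avoids oscillation in the solution space.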
author2 Lin, Sheng-Fuu
author_facet Lin, Sheng-Fuu
Ho, Chang-An
何長安
author Ho, Chang-An
何長安
spellingShingle Ho, Chang-An
何長安
Safe Reinforcement Learning based Sequential Perturbation Learning Algorithm
author_sort Ho, Chang-An
title Safe Reinforcement Learning based Sequential Perturbation Learning Algorithm
title_short Safe Reinforcement Learning based Sequential Perturbation Learning Algorithm
title_full Safe Reinforcement Learning based Sequential Perturbation Learning Algorithm
title_fullStr Safe Reinforcement Learning based Sequential Perturbation Learning Algorithm
title_full_unstemmed Safe Reinforcement Learning based Sequential Perturbation Learning Algorithm
title_sort safe reinforcement learning based sequential perturbation learning algorithm
publishDate 2009
url http://ndltd.ncl.edu.tw/handle/63234750154932788712
work_keys_str_mv AT hochangan safereinforcementlearningbasedsequentialperturbationlearningalgorithm
AT hézhǎngān safereinforcementlearningbasedsequentialperturbationlearningalgorithm
AT hochangan jīyúānquánxìngzēngqiángshìxuéxízhīxúnxùrǎodòngxuéxíyǎnsuànfǎ
AT hézhǎngān jīyúānquánxìngzēngqiángshìxuéxízhīxúnxùrǎodòngxuéxíyǎnsuànfǎ
_version_ 1717768430870855680