Safe Reinforcement Learning based Sequential Perturbation Learning Algorithm
Master's === National Chiao Tung University === Department of Electrical and Control Engineering === 97 === This thesis presents a sequential perturbation learning architecture based on safe reinforcement learning (SRL-SP), which uses the concept of line search to apply perturbations to each weight of a neural network. After the perturbations are applied, the value functions of the pre-perturbation and post-perturbation networks are compared so as to update the weights. Applying perturbations keeps the solution from falling into local optima and from oscillating in the solution space, both of which reduce learning efficiency. In addition, within the reinforcement learning structure, Lyapunov design methods are used to set the learning objective and a predefined set of goal states, which greatly reduces the learning time; in other words, it rapidly guides the plant's state into the goal state. In the simulations, an n-mass inverted pendulum model is used as a humanoid robot model to show that the proposed method learns more effectively.
Main Authors: | Ho, Chang-An 何長安 |
---|---|
Other Authors: | Lin, Sheng-Fuu 林昇甫 |
Format: | Others |
Language: | zh-TW |
Published: | 2009 |
Online Access: | http://ndltd.ncl.edu.tw/handle/63234750154932788712 |
id | ndltd-TW-097NCTU5591131 |
record_format | oai_dc |
spelling |
ndltd-TW-097NCTU55911312015-10-13T15:42:34Z http://ndltd.ncl.edu.tw/handle/63234750154932788712 Safe Reinforcement Learning based Sequential Perturbation Learning Algorithm 基於安全性增強式學習之循序擾動學習演算法 Ho, Chang-An 何長安 Master's National Chiao Tung University Department of Electrical and Control Engineering 97 This thesis presents a sequential perturbation learning architecture based on safe reinforcement learning (SRL-SP), which uses the concept of line search to apply perturbations to each weight of a neural network. After the perturbations are applied, the value functions of the pre-perturbation and post-perturbation networks are compared so as to update the weights. Applying perturbations keeps the solution from falling into local optima and from oscillating in the solution space, both of which reduce learning efficiency. In addition, within the reinforcement learning structure, Lyapunov design methods are used to set the learning objective and a predefined set of goal states, which greatly reduces the learning time; in other words, it rapidly guides the plant's state into the goal state. In the simulations, an n-mass inverted pendulum model is used as a humanoid robot model to show that the proposed method learns more effectively. Lin, Sheng-Fuu 林昇甫 2009 thesis 89 zh-TW |
collection | NDLTD |
language | zh-TW |
format | Others |
sources | NDLTD |
description | Master's === National Chiao Tung University === Department of Electrical and Control Engineering === 97 === This thesis presents a sequential perturbation learning architecture based on safe reinforcement learning (SRL-SP), which uses the concept of line search to apply perturbations to each weight of a neural network. After the perturbations are applied, the value functions of the pre-perturbation and post-perturbation networks are compared so as to update the weights. Applying perturbations keeps the solution from falling into local optima and from oscillating in the solution space, both of which reduce learning efficiency. In addition, within the reinforcement learning structure, Lyapunov design methods are used to set the learning objective and a predefined set of goal states, which greatly reduces the learning time; in other words, it rapidly guides the plant's state into the goal state. In the simulations, an n-mass inverted pendulum model is used as a humanoid robot model to show that the proposed method learns more effectively. |
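The abstract describes perturbing each network weight in turn, comparing the value function before and after each perturbation, and keeping only improving changes. A loose illustration of that idea follows; this is a sketch, not the thesis's actual algorithm — the step sizes, the acceptance rule, and the quadratic test function are all assumptions introduced here.

```python
def sequential_perturbation_step(weights, value_fn, deltas=(0.1, 0.01)):
    """One pass of sequential perturbation (illustrative sketch).

    For each weight in turn, try a coarse line search over +/- delta
    steps and keep a perturbation only if it raises value_fn, so the
    search never accepts a worsening move.
    """
    w = list(weights)
    best = value_fn(w)
    for i in range(len(w)):
        for d in deltas:
            for sign in (1.0, -1.0):
                trial = list(w)
                trial[i] += sign * d          # perturb one weight
                v = value_fn(trial)
                if v > best:                  # accept only improvements
                    best, w = v, trial
    return w, best


if __name__ == "__main__":
    # Hypothetical value function: peak at w = [0.1, 0.1, 0.1].
    value = lambda w: -sum((wi - 0.1) ** 2 for wi in w)
    w1, v1 = sequential_perturbation_step([0.0, 0.0, 0.0], value)
    print(w1, v1)  # one pass moves every weight toward the peak
```

Because a perturbation is only kept when the value function improves, the update is monotone in the value, which is the sense in which such a scheme avoids oscillating in the solution space.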
author2 | Lin, Sheng-Fuu |
author_facet | Lin, Sheng-Fuu Ho, Chang-An 何長安 |
author | Ho, Chang-An 何長安 |
spellingShingle | Ho, Chang-An 何長安 Safe Reinforcement Learning based Sequential Perturbation Learning Algorithm |
author_sort | Ho, Chang-An |
title | Safe Reinforcement Learning based Sequential Perturbation Learning Algorithm |
title_short | Safe Reinforcement Learning based Sequential Perturbation Learning Algorithm |
title_full | Safe Reinforcement Learning based Sequential Perturbation Learning Algorithm |
title_fullStr | Safe Reinforcement Learning based Sequential Perturbation Learning Algorithm |
title_full_unstemmed | Safe Reinforcement Learning based Sequential Perturbation Learning Algorithm |
title_sort | safe reinforcement learning based sequential perturbation learning algorithm |
publishDate | 2009 |
url | http://ndltd.ncl.edu.tw/handle/63234750154932788712 |
work_keys_str_mv | AT hochangan safereinforcementlearningbasedsequentialperturbationlearningalgorithm AT hézhǎngān safereinforcementlearningbasedsequentialperturbationlearningalgorithm AT hochangan jīyúānquánxìngzēngqiángshìxuéxízhīxúnxùrǎodòngxuéxíyǎnsuànfǎ AT hézhǎngān jīyúānquánxìngzēngqiángshìxuéxízhīxúnxùrǎodòngxuéxíyǎnsuànfǎ |
_version_ | 1717768430870855680 |