The Reinforcement Learning Behavior Unit Weights Searching based on Genetic Algorithm
Master's === National Chung Cheng University === Graduate Institute of Electrical Engineering === 95 === This thesis proposes a learning scheme that uses a Stochastic Searching Network and a Genetic Algorithm (GA) to search for the weights of the action network in Reinforcement Learning. The SGRL learning scheme is a hybrid Genetic Algorithm, which integrates the Stoch...
Main Authors: | Tsung-Fei Tzou, 鄒璁飛 |
---|---|
Other Authors: | Kao-Shing Hwang |
Format: | Others |
Language: | zh-TW |
Published: | 2007 |
Online Access: | http://ndltd.ncl.edu.tw/handle/67380234386530334498 |
id |
ndltd-TW-095CCU05442094 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-095CCU05442094 2015-10-13T11:31:38Z http://ndltd.ncl.edu.tw/handle/67380234386530334498 The Reinforcement Learning Behavior Unit Weights Searching based on Genetic Algorithm 植基於基因演算法之加強式學習中行為單元權重之搜尋 Tsung-Fei Tzou 鄒璁飛 Master's National Chung Cheng University Graduate Institute of Electrical Engineering 95 This thesis proposes SGRL, a learning scheme that uses a Stochastic Searching Network and a Genetic Algorithm (GA) to search for the weights of the action network in Reinforcement Learning. SGRL is a hybrid Genetic Algorithm: it integrates the Stochastic Searching Network with the GA to carry out the action-network weight search. Structurally, the SGRL learning system is composed of two integrated feed-forward networks. One, the critic network, assists the learning of the other, the action network, which is an ordinary neural network that determines the outputs (actions) of the system. Using the TD (Temporal Difference) prediction method, the critic network predicts the external reinforcement signal and provides a more informative internal reinforcement signal to the action network. The action network adapts itself through the GA, guided by a reference model of the plant dynamics and by the internal reinforcement signal. The key concept of the SGRL learning scheme is to formulate the internal reinforcement signal contributed by the reference plant model as the fitness function for the GA. Computer simulations of controlling the Acrobot, an underactuated system (i.e., one with fewer actuators than degrees of freedom), and the mountain-car system illustrate the performance and applicability of the proposed learning controller. Kao-Shing Hwang 黃國勝 2007 Thesis 78 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Chung Cheng University === Graduate Institute of Electrical Engineering === 95 === This thesis proposes SGRL, a learning scheme that uses a Stochastic Searching Network and a Genetic Algorithm (GA) to search for the weights of the action network in Reinforcement Learning. SGRL is a hybrid Genetic Algorithm: it integrates the Stochastic Searching Network with the GA to carry out the action-network weight search. Structurally, the SGRL learning system is composed of two integrated feed-forward networks. One, the critic network, assists the learning of the other, the action network, which is an ordinary neural network that determines the outputs (actions) of the system. Using the TD (Temporal Difference) prediction method, the critic network predicts the external reinforcement signal and provides a more informative internal reinforcement signal to the action network. The action network adapts itself through the GA, guided by a reference model of the plant dynamics and by the internal reinforcement signal. The key concept of the SGRL learning scheme is to formulate the internal reinforcement signal contributed by the reference plant model as the fitness function for the GA.
Computer simulations of controlling the Acrobot, an underactuated system (i.e., one with fewer actuators than degrees of freedom), and the mountain-car system illustrate the performance and applicability of the proposed learning controller.
|
author2 |
Kao-Shing Hwang |
author_facet |
Kao-Shing Hwang; Tsung-Fei Tzou; 鄒璁飛 |
author |
Tsung-Fei Tzou 鄒璁飛 |
spellingShingle |
Tsung-Fei Tzou 鄒璁飛 The Reinforcement Learning Behavior Unit Weights Searching based on Genetic Algorithm |
author_sort |
Tsung-Fei Tzou |
title |
The Reinforcement Learning Behavior Unit Weights Searching based on Genetic Algorithm |
title_sort |
reinforcement learning behavior unit weights searching based on genetic algorithm |
publishDate |
2007 |
url |
http://ndltd.ncl.edu.tw/handle/67380234386530334498 |
work_keys_str_mv |
AT tsungfeitzou thereinforcementlearningbehaviorunitweightssearchingbasedongeneticalgorithm AT zōucōngfēi thereinforcementlearningbehaviorunitweightssearchingbasedongeneticalgorithm AT tsungfeitzou zhíjīyújīyīnyǎnsuànfǎzhījiāqiángshìxuéxízhōngxíngwèidānyuánquánzhòngzhīsōuxún AT zōucōngfēi zhíjīyújīyīnyǎnsuànfǎzhījiāqiángshìxuéxízhōngxíngwèidānyuánquánzhòngzhīsōuxún AT tsungfeitzou reinforcementlearningbehaviorunitweightssearchingbasedongeneticalgorithm AT zōucōngfēi reinforcementlearningbehaviorunitweightssearchingbasedongeneticalgorithm |
_version_ |
1716845085921902592 |
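
Illustrative note on the abstract: the key idea stated there, using the critic's internal reinforcement signal as the GA fitness for the action-network weights, can be sketched roughly as follows. This is a minimal sketch under loud assumptions and not the thesis's implementation: the 1-D plant, the linear critic, the one-unit action network, the uniform-crossover GA, and all names and constants (`plant_step`, `LinearCritic`, `evolve`, `GAMMA`, learning rate, population size) are hypothetical stand-ins. The Stochastic Searching Network and the plant reference model are not modeled, and the internal reinforcement is taken to be the standard TD error.

```python
import math
import random

GAMMA = 0.95  # discount factor (assumed value, not from the thesis)

def internal_reinforcement(r, v_now, v_next, gamma=GAMMA):
    """TD error: external reward plus the discounted change in the critic's prediction."""
    return r + gamma * v_next - v_now

class LinearCritic:
    """Hypothetical stand-in critic, V(x) = c . phi(x), trained with TD(0)."""
    def __init__(self, lr=0.05):
        self.c = [0.0, 0.0]
        self.lr = lr

    def features(self, x):
        return [1.0, x * x]

    def value(self, x):
        return sum(ci * fi for ci, fi in zip(self.c, self.features(x)))

    def update(self, x, r, x_next):
        delta = internal_reinforcement(r, self.value(x), self.value(x_next))
        self.c = [ci + self.lr * delta * fi
                  for ci, fi in zip(self.c, self.features(x))]

def action(weights, x):
    """Tiny action network: a single tanh unit with a gain and a bias."""
    return math.tanh(weights[0] * x + weights[1])

def plant_step(x, u):
    """Hypothetical 1-D plant; the external reinforcement penalizes distance from 0."""
    x_next = x + 0.2 * u
    return x_next, -abs(x_next)

def rollout_fitness(weights, critic, x0=1.0, steps=30):
    """GA fitness: internal reinforcement summed over one rollout (critic held fixed)."""
    x, total = x0, 0.0
    for _ in range(steps):
        x_next, r = plant_step(x, action(weights, x))
        total += internal_reinforcement(r, critic.value(x), critic.value(x_next))
        x = x_next
    return total

def train_critic(critic, weights, x0=1.0, steps=30):
    """Run one rollout with the given action weights and apply TD(0) updates."""
    x = x0
    for _ in range(steps):
        x_next, r = plant_step(x, action(weights, x))
        critic.update(x, r, x_next)
        x = x_next

def evolve(fitness, n_weights=2, pop_size=20, generations=15, sigma=0.3, rng=None):
    """Simple GA: truncation selection, uniform crossover, Gaussian mutation, elitism."""
    rng = rng or random.Random(0)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(n_weights)]
           for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [ranked[0]]  # keep the best individual unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append([rng.choice(pair) + rng.gauss(0.0, sigma)
                             for pair in zip(a, b)])
        pop = children
    return max(pop, key=fitness)

if __name__ == "__main__":
    critic, rng = LinearCritic(), random.Random(0)
    for _ in range(5):
        # Alternate: search action weights with the GA, then refine the critic.
        best = evolve(lambda w: rollout_fitness(w, critic), rng=rng)
        train_critic(critic, best)
    print("best action weights:", best)
```

Holding the critic fixed while a generation is evaluated keeps the fitness comparable across individuals; whether the thesis updates the critic online during evaluation is not stated in the abstract.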