The Reinforcement Learning Behavior Unit Weights Searching based on Genetic Algorithm

Master's === National Chung Cheng University === Department of Electrical Engineering === 95 === This thesis proposes a scheme based on a Stochastic Searching Network and a Genetic Algorithm (GA), and uses a Reinforcement Learning method for the action network weight search problem. The SGRL learning scheme is a hybrid Genetic Algorithm, which integrates the Stoch...


Bibliographic Details
Main Authors: Tsung-Fei Tzou, 鄒璁飛
Other Authors: Kao-Shing Hwang
Format: Others
Language: zh-TW
Published: 2007
Online Access: http://ndltd.ncl.edu.tw/handle/67380234386530334498
id ndltd-TW-095CCU05442094
record_format oai_dc
spelling ndltd-TW-095CCU05442094 2015-10-13T11:31:38Z http://ndltd.ncl.edu.tw/handle/67380234386530334498 The Reinforcement Learning Behavior Unit Weights Searching based on Genetic Algorithm 植基於基因演算法之加強式學習中行為單元權重之搜尋 Tsung-Fei Tzou 鄒璁飛 Master's National Chung Cheng University Department of Electrical Engineering 95 Kao-Shing Hwang 黃國勝 2007 thesis 78 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Chung Cheng University === Department of Electrical Engineering === 95 === This thesis proposes a learning scheme, SGRL, that combines a Stochastic Searching Network with a Genetic Algorithm (GA) to search for the weights of the action network in Reinforcement Learning. SGRL is a hybrid Genetic Algorithm: it integrates the Stochastic Searching Network and the GA to carry out the action network weight search. Structurally, the SGRL learning system consists of two integrated feed-forward networks. One, the critic network, assists the learning of the other, the action network, which is an ordinary neural network and determines the outputs (actions) of the system. Using the TD (Temporal Difference) prediction method, the critic network predicts the external reinforcement signal and provides a more informative internal reinforcement signal to the action network. The action network adapts itself through the GA, guided by a dynamic reference model of the plant and by the internal reinforcement signal. The key concept of the SGRL scheme is to use the internal reinforcement signal, as contributed by the reference plant model, as the fitness function for the GA. Computer simulations of controlling the Acrobot system (i.e. a system possessing fewer actuators than degrees of freedom) and the mountain-car system were conducted to illustrate the performance and applicability of the proposed learning controller.
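The idea in the abstract of using a critic's internal reinforcement signal as the GA fitness can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the one-dimensional `plant_model`, the fixed hand-written `critic` (the thesis trains its critic with TD prediction), the two-weight `tanh` action unit, the common TD form `r + gamma * V(s') - V(s)` for the internal signal, and the GA settings (truncation selection, uniform crossover, Gaussian mutation, elitism). The Stochastic Searching Network component is not modelled.

```python
import math
import random


def internal_reinforcement(r, v_now, v_next, gamma=0.95):
    """TD-style internal signal (assumed form): r + gamma * V(s') - V(s)."""
    return r + gamma * v_next - v_now


def critic(x):
    """Hypothetical fixed critic: states near 0 are valued higher."""
    return -abs(x)


def plant_model(x, a):
    """Hypothetical 1-D reference plant: the action nudges the state."""
    return x + 0.1 * a


def fitness(weights, x0=1.0, steps=20):
    """Sum of internal reinforcement along a rollout through the plant model."""
    w, b = weights
    x, total = x0, 0.0
    for _ in range(steps):
        a = math.tanh(w * x + b)          # action unit output
        x_next = plant_model(x, a)
        r = -abs(x_next)                  # assumed external reinforcement
        total += internal_reinforcement(r, critic(x), critic(x_next))
        x = x_next
    return total


def evolve(pop_size=20, generations=30, sigma=0.3, seed=0):
    """Search the action-unit weights with a small elitist GA."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(pop_size)]
    history = []                          # best fitness per generation
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]    # truncation selection
        children = [pop[0][:]]            # elitism: keep the current best
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            # uniform crossover per weight, then Gaussian mutation
            child = [rng.choice(pair) + rng.gauss(0.0, sigma)
                     for pair in zip(p1, p2)]
            children.append(child)
        pop = children
    return max(pop, key=fitness), history


if __name__ == "__main__":
    best, history = evolve()
    print("best weights:", best)
    print("first/last best fitness:", history[0], history[-1])
```

Because the best individual is copied unchanged into each new generation, the recorded best fitness cannot decrease from one generation to the next in this sketch.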
author2 Kao-Shing Hwang
author_facet Kao-Shing Hwang
Tsung-Fei Tzou
鄒璁飛
author Tsung-Fei Tzou
鄒璁飛
spellingShingle Tsung-Fei Tzou
鄒璁飛
The Reinforcement Learning Behavior Unit Weights Searching based on Genetic Algorithm
author_sort Tsung-Fei Tzou
title The Reinforcement Learning Behavior Unit Weights Searching based on Genetic Algorithm
publishDate 2007
url http://ndltd.ncl.edu.tw/handle/67380234386530334498