Summary: | Master's === National Chung Cheng University === Graduate Institute of Electrical Engineering === 95 === This thesis proposes SGRL, a learning scheme based on a Stochastic Searching Network and a Genetic Algorithm (GA), and applies it to the problem of searching for action network weights in Reinforcement Learning. SGRL is a hybrid Genetic Algorithm that integrates the Stochastic Searching Network with the GA to carry out this weight search. Structurally, the SGRL learning system is composed of two integrated feed-forward networks. One, the critic network, assists the learning of the other, the action network, which is an ordinary neural network that determines the outputs (actions) of the system. Using the TD (Temporal Difference) prediction method, the critic network predicts the external reinforcement signal and provides a more informative internal reinforcement signal to the action network. The action network uses the GA, together with a reference model of the plant dynamics, to adapt itself according to the internal reinforcement signal. The key concept of the SGRL learning scheme is to formulate the internal reinforcement signal contributed by the reference plant model as the fitness function for the GA.
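The interplay between the TD-based critic and the GA-driven action network can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not specify the network sizes, the Stochastic Searching Network, the plant reference model, or the GA operators, so the linear critic, the discount factor, the elitist selection, the one-point crossover, the Gaussian mutation, and all function names below are assumptions chosen only for illustration.

```python
import random

GAMMA = 0.95  # discount factor (assumed value, not taken from the thesis)


def internal_reinforcement(r, v_now, v_next, gamma=GAMMA):
    """TD-style internal reinforcement: r + gamma * V(s') - V(s)."""
    return r + gamma * v_next - v_now


def critic_update(w, features, td_error, lr=0.1):
    """One TD(0) step for a linear critic V(s) = w . features (assumed form)."""
    return [wi + lr * td_error * fi for wi, fi in zip(w, features)]


def ga_step(population, fitness, rng, mutation_std=0.1):
    """One GA generation over action-network weight vectors.

    Keeps the better half unchanged (elitism) and refills the population
    with mutated one-point crossovers of randomly chosen parents.
    Requires at least 4 individuals with at least 2 weights each.
    """
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[: len(ranked) // 2]
    children = []
    while len(parents) + len(children) < len(population):
        a, b = rng.sample(parents, 2)
        cut = rng.randrange(1, len(a))  # one-point crossover position
        child = a[:cut] + b[cut:]
        children.append([g + rng.gauss(0.0, mutation_std) for g in child])
    return parents + children
```

In a full system, the `fitness` callable passed to `ga_step` would score a candidate weight vector by the internal reinforcement it accumulates while controlling the plant (or its reference model), computed with `internal_reinforcement` from the critic's predictions. Any deterministic scoring function can stand in for it when trying the GA step in isolation.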
Computer simulations of controlling the Acrobot (an underactuated system, i.e., one possessing fewer actuators than degrees of freedom) and the mountain-car system have been conducted to illustrate the performance and applicability of the proposed learning controller scheme.
|