Risk-Avoiding Reinforcement Learning



Bibliographic Details
Main Authors: Jung-Jung Yeh, 葉蓉蓉
Other Authors: Shou-de Lin
Format: Others
Language: en_US
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/11854708998176094577
id ndltd-TW-099NTU05392120
record_format oai_dc
spelling ndltd-TW-099NTU053921202015-10-16T04:03:11Z http://ndltd.ncl.edu.tw/handle/11854708998176094577 Risk-Avoiding Reinforcement Learning 可避免風險之強化學習演算法 Jung-Jung Yeh 葉蓉蓉 Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 99 === Traditional reinforcement learning agents focus on maximizing the expected cumulative reward and ignore the distribution of the return. However, for some tasks people prefer actions that may yield less return but are more likely to avoid disaster. This thesis proposes defining risk as the expected loss and accordingly designs a risk-avoiding reinforcement learning agent. Our experiments show that such an agent can mitigate several types of risk, such as the variance of return, the maximal loss, and the probability of fatal errors. Defining risk in terms of loss makes it possible to reduce banks' credit risk as well as the losses arising in stock margin trading, which the previous literature can hardly cope with effectively. We design a Q-decomposed reinforcement learning system to handle the trade-off between expected loss and expected return. The framework consists of two subagents and one arbiter: the subagents learn the expected loss and the expected return individually, and the arbiter evaluates the sum of the return and loss for each action and takes the best one. We perform two experiments: a grid world and simulated trades on the Taiwanese electronics stock index. In the grid world, we evaluate the expected return and expected loss of agents at different levels of risk aversion. In the stock trading experiment, we compare the risk-avoiding agent with variance-penalized and risk-sensitive agents. The results show that our risk-avoiding agent not only reduces the expected loss but also cuts down other kinds of risk. Shou-de Lin 林守德 2011 學位論文 ; thesis 59 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 99 === Traditional reinforcement learning agents focus on maximizing the expected cumulative reward and ignore the distribution of the return. However, for some tasks people prefer actions that may yield less return but are more likely to avoid disaster. This thesis proposes defining risk as the expected loss and accordingly designs a risk-avoiding reinforcement learning agent. Our experiments show that such an agent can mitigate several types of risk, such as the variance of return, the maximal loss, and the probability of fatal errors. Defining risk in terms of loss makes it possible to reduce banks' credit risk as well as the losses arising in stock margin trading, which the previous literature can hardly cope with effectively. We design a Q-decomposed reinforcement learning system to handle the trade-off between expected loss and expected return. The framework consists of two subagents and one arbiter: the subagents learn the expected loss and the expected return individually, and the arbiter evaluates the sum of the return and loss for each action and takes the best one. We perform two experiments: a grid world and simulated trades on the Taiwanese electronics stock index. In the grid world, we evaluate the expected return and expected loss of agents at different levels of risk aversion. In the stock trading experiment, we compare the risk-avoiding agent with variance-penalized and risk-sensitive agents. The results show that our risk-avoiding agent not only reduces the expected loss but also cuts down other kinds of risk.
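The Q-decomposed framework in the abstract (two subagents estimating expected return and expected loss, plus an arbiter that scores each action by their weighted sum) can be sketched in a few lines. Everything beyond that decomposition is an illustrative assumption here: the state/action encoding, the definition of loss as the negative part of each reward, the SARSA-style update on the arbiter's action, and the trade-off weight `lam` are not taken from the thesis itself.

```python
import random

class RiskAvoidingAgent:
    """Sketch of a Q-decomposed risk-avoiding agent: one subagent
    learns expected return, another learns expected loss (negative
    rewards only), and an arbiter combines the two estimates."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95, lam=1.0):
        self.q_return = [[0.0] * n_actions for _ in range(n_states)]
        self.q_loss = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma, self.lam = alpha, gamma, lam
        self.n_actions = n_actions

    def act(self, s, eps=0.1):
        if random.random() < eps:
            return random.randrange(self.n_actions)
        # Arbiter: score each action by expected return plus the
        # lam-weighted expected loss (loss estimates are <= 0).
        scores = [self.q_return[s][a] + self.lam * self.q_loss[s][a]
                  for a in range(self.n_actions)]
        return scores.index(max(scores))

    def update(self, s, a, r, s2):
        a2 = self.act(s2, eps=0.0)  # arbiter's greedy choice at s2
        loss = min(r, 0.0)          # only negative reward counts as loss
        # SARSA-style updates on the arbiter's action keep both
        # subagents evaluating the same risk-averse policy.
        self.q_return[s][a] += self.alpha * (
            r + self.gamma * self.q_return[s2][a2] - self.q_return[s][a])
        self.q_loss[s][a] += self.alpha * (
            loss + self.gamma * self.q_loss[s2][a2] - self.q_loss[s][a])
```

With `lam = 0` the arbiter reduces to ordinary greedy Q-learning on return alone; raising `lam` makes the agent trade expected return for a smaller expected loss, which is the tunable risk-aversion level the grid-world experiment varies.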
author2 Shou-de Lin
author Jung-Jung Yeh
葉蓉蓉
title Risk-Avoiding Reinforcement Learning
publishDate 2011
url http://ndltd.ncl.edu.tw/handle/11854708998176094577