Risk-Avoiding Reinforcement Learning
可避免風險之強化學習演算法 (Chinese title)

Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === Academic year 99 (2011)
Main Author: Jung-Jung Yeh (葉蓉蓉)
Other Authors: Shou-de Lin (林守德)
Format: Others
Language: en_US
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/11854708998176094577
id: ndltd-TW-099NTU05392120
Type: 學位論文 (degree thesis), 59 pages

Description:
Traditional reinforcement learning agents focus on maximizing the expected cumulative reward and ignore the distribution of the return. However, for some tasks people prefer actions that may yield less return but are more likely to avoid disaster. This thesis proposes defining risk as the expected loss and accordingly designs a risk-avoiding reinforcement learning agent. Our experiments show that such an agent can reduce several types of risk, such as the variance of the return, the maximal loss, and the probability of fatal errors. Defining risk in terms of loss makes it possible to reduce the credit risk faced by banks as well as the losses that arise in stock margin trading, which previous work could hardly cope with effectively.
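One plausible way to formalize this loss-based risk (the notation and the negative-part form of the loss are our assumptions, not taken verbatim from the thesis):

```latex
% Risk as expected loss: only the negative part of the return counts.
% R^\pi(s) is the (random) return from state s under policy \pi.
\[
  \mathrm{Risk}^{\pi}(s) \;=\; \mathbb{E}\bigl[\max(0,\,-R^{\pi}(s))\bigr]
\]
% A risk-avoiding agent then trades return off against expected loss,
% with \lambda \ge 0 controlling the level of risk aversion (assumed):
\[
  \pi^{*} \;=\; \arg\max_{\pi}\;
    \mathbb{E}\bigl[R^{\pi}(s)\bigr] \;-\; \lambda\,\mathrm{Risk}^{\pi}(s)
\]
```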
We design a Q-decomposed reinforcement learning system to handle the tradeoff between expected loss and expected return. The framework consists of two subagents and one arbiter: the subagents learn the expected loss and the expected return individually, and the arbiter evaluates the combined return and loss estimates of each action and takes the best one.
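A minimal sketch of this two-subagent-plus-arbiter architecture in Python, using tabular SARSA-style updates as in standard Q-decomposition; the class names, parameter names, and epsilon-greedy exploration are our assumptions, not the thesis's:

```python
import random
from collections import defaultdict

class Subagent:
    """Tabular learner for one component of the signal (the reward
    stream or the loss stream), updated SARSA-style on the action
    actually chosen by the arbiter."""
    def __init__(self, alpha=0.1, gamma=0.95):
        self.q = defaultdict(float)            # (state, action) -> estimate
        self.alpha, self.gamma = alpha, gamma

    def update(self, s, a, signal, s2, a2, done):
        # Bootstrap on the arbiter's next action unless the episode ended.
        target = signal if done else signal + self.gamma * self.q[(s2, a2)]
        self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])

class Arbiter:
    """Combines the subagents' per-action estimates and takes the best
    action; risk_weight scales how heavily expected loss counts."""
    def __init__(self, return_agent, loss_agent, risk_weight=1.0, epsilon=0.1):
        self.ret, self.loss = return_agent, loss_agent
        self.lam, self.epsilon = risk_weight, epsilon

    def select(self, s, actions):
        if random.random() < self.epsilon:     # epsilon-greedy exploration
            return random.choice(actions)
        return max(actions, key=lambda a:
                   self.ret.q[(s, a)] - self.lam * self.loss.q[(s, a)])
```

Here `risk_weight` plays the role of the risk-aversion weight λ above: at 0 the arbiter reduces to an ordinary return-maximizing learner, and larger values make it increasingly conservative.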
We perform two experiments: a grid world and simulated trades on the Taiwan electronics stock index. In the grid world, we evaluate the expected return and the expected loss of agents with different levels of risk aversion. In the stock-trading experiment, we compare the risk-avoiding agent with variance-penalized and risk-sensitive agents. The results show that our risk-avoiding agent not only reduces the expected loss but also cuts down other kinds of risk.
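A sketch of how such an agent might be trained and swept over risk-aversion levels, reusing the classes above; the environment interface (`reset`/`actions`/`step`) and the `GridWorld` name are hypothetical:

```python
def train(env, episodes=5000, risk_weight=1.0):
    # Assumed environment interface: reset() -> state,
    # actions(state) -> list of actions, step(action) -> (state, reward, done).
    ret_agent, loss_agent = Subagent(), Subagent()
    arbiter = Arbiter(ret_agent, loss_agent, risk_weight)
    for _ in range(episodes):
        s = env.reset()
        a = arbiter.select(s, env.actions(s))
        done = False
        while not done:
            s2, r, done = env.step(a)
            a2 = a if done else arbiter.select(s2, env.actions(s2))
            # Split the scalar reward: the loss subagent sees only the
            # negative part, matching risk defined as expected loss.
            ret_agent.update(s, a, r, s2, a2, done)
            loss_agent.update(s, a, max(0.0, -r), s2, a2, done)
            s, a = s2, a2
    return arbiter

# Sweeping risk_weight reproduces a comparison across levels of
# risk aversion (GridWorld is a placeholder environment):
# for lam in (0.0, 0.5, 1.0, 2.0):
#     agent = train(GridWorld(), risk_weight=lam)
```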