A Novel Deep Reinforcement Algorithm with Adaptive Sampling Strategy for Continuous Portfolio Optimization

Bibliographic Details
Main Authors: Miao, Yu-Hsiang, 繆宇翔
Other Authors: Chen, An-Pin
Format: Others
Language: en_US
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/wvv2ff
id ndltd-TW-107NCTU5396024
record_format oai_dc
spelling ndltd-TW-107NCTU53960242019-11-26T05:16:53Z http://ndltd.ncl.edu.tw/handle/wvv2ff A Novel Deep Reinforcement Algorithm with Adaptive Sampling Strategy for Continuous Portfolio Optimization 以深度強化學習結合自適應取樣策略於連續投資組合最佳化 Miao, Yu-Hsiang 繆宇翔 Master's, National Chiao Tung University, Institute of Information Management, 107. Quantitative trading seeks stable and profitable trading strategies by analyzing historical data with statistical or mathematical methods. With the advancement of technology and the development of computing equipment, many studies have shown that deep reinforcement learning can perform well in quantitative trading without imposing many assumptions on the financial market, but these studies still fall short in generalizing trading strategies. To strengthen the generalization ability of the trading strategy, this study takes the constituents of the Dow Jones Industrial Average as the target universe and formulates the task as a portfolio optimization problem. The goal is to construct a portfolio of five assets from the constituent stocks such that the portfolio achieves strong performance under our trading strategy. Optimizing such a problem requires the agent to simulate and explore many possibilities, but simulating all of them demands substantial computation and time. Hence, this study proposes a sampling strategy that determines which data are worth learning from by observing the learning status. With this strategy, the agent can learn a general trading strategy more effectively within a limited period of time. In addition to the sampling strategy, we apply adversarial learning during the reinforcement learning process to enhance the model's robustness. The experimental results show that the model trained with our sampling strategy outperforms the one trained with a random sampling strategy: the Sharpe ratio increases by 6-7% and the profit increases by nearly 45%.
The outcome of the experiment demonstrates that our proposed learning framework with the sampling strategy is conducive to obtaining reliable trading rules. Chen, An-Pin Huang, Szu-Hao 陳安斌 黃思皓 2019 academic thesis ; thesis 71 en_US
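The adaptive sampling idea summarized in the abstract, namely prioritizing training data the agent still handles poorly instead of sampling uniformly at random, is closely related to priority-based replay sampling. Below is a minimal Python sketch under that assumption; the class name `AdaptiveSampler`, the priority exponent `alpha`, and the Sharpe-ratio helper are illustrative choices, not the thesis's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio of a return series (risk-free rate taken as 0)."""
    returns = np.asarray(returns, dtype=float)
    sd = returns.std(ddof=1)
    if sd == 0:
        return 0.0
    return float(np.sqrt(periods_per_year) * returns.mean() / sd)

class AdaptiveSampler:
    """Samples training windows in proportion to their recent error,
    so the agent revisits market segments it still predicts poorly."""

    def __init__(self, n_windows, alpha=0.6, eps=1e-3):
        # Start with uniform priorities so every window can be drawn initially.
        self.priorities = np.ones(n_windows)
        self.alpha = alpha  # 0 -> uniform sampling, 1 -> fully proportional
        self.eps = eps      # keeps every priority strictly positive

    def sample(self, batch_size):
        # Convert priorities into a probability distribution and draw indices.
        p = self.priorities ** self.alpha
        p /= p.sum()
        return rng.choice(len(self.priorities), size=batch_size, p=p)

    def update(self, indices, errors):
        # Windows with larger learning error become more likely to be resampled.
        self.priorities[indices] = np.abs(errors) + self.eps
```

A typical loop would call `sample` to pick a training batch, measure the agent's error on those windows, then feed the errors back through `update`, so attention gradually concentrates on poorly-learned periods.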
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Chiao Tung University === Institute of Information Management === 107 === Quantitative trading seeks stable and profitable trading strategies by analyzing historical data with statistical or mathematical methods. With the advancement of technology and the development of computing equipment, many studies have shown that deep reinforcement learning can perform well in quantitative trading without imposing many assumptions on the financial market, but these studies still fall short in generalizing trading strategies. To strengthen the generalization ability of the trading strategy, this study takes the constituents of the Dow Jones Industrial Average as the target universe and formulates the task as a portfolio optimization problem. The goal is to construct a portfolio of five assets from the constituent stocks such that the portfolio achieves strong performance under our trading strategy. Optimizing such a problem requires the agent to simulate and explore many possibilities, but simulating all of them demands substantial computation and time. Hence, this study proposes a sampling strategy that determines which data are worth learning from by observing the learning status. With this strategy, the agent can learn a general trading strategy more effectively within a limited period of time. In addition to the sampling strategy, we apply adversarial learning during the reinforcement learning process to enhance the model's robustness. The experimental results show that the model trained with our sampling strategy outperforms the one trained with a random sampling strategy: the Sharpe ratio increases by 6-7% and the profit increases by nearly 45%. The outcome of the experiment demonstrates that our proposed learning framework with the sampling strategy is conducive to obtaining reliable trading rules.
author2 Chen, An-Pin
author_facet Chen, An-Pin
Miao, Yu-Hsiang
繆宇翔
author Miao, Yu-Hsiang
繆宇翔
spellingShingle Miao, Yu-Hsiang
繆宇翔
A Novel Deep Reinforcement Algorithm with Adaptive Sampling Strategy for Continuous Portfolio Optimization
author_sort Miao, Yu-Hsiang
title A Novel Deep Reinforcement Algorithm with Adaptive Sampling Strategy for Continuous Portfolio Optimization
title_short A Novel Deep Reinforcement Algorithm with Adaptive Sampling Strategy for Continuous Portfolio Optimization
title_full A Novel Deep Reinforcement Algorithm with Adaptive Sampling Strategy for Continuous Portfolio Optimization
title_fullStr A Novel Deep Reinforcement Algorithm with Adaptive Sampling Strategy for Continuous Portfolio Optimization
title_full_unstemmed A Novel Deep Reinforcement Algorithm with Adaptive Sampling Strategy for Continuous Portfolio Optimization
title_sort novel deep reinforcement algorithm with adaptive sampling strategy for continuous portfolio optimization
publishDate 2019
url http://ndltd.ncl.edu.tw/handle/wvv2ff
work_keys_str_mv AT miaoyuhsiang anoveldeepreinforcementalgorithmwithadaptivesamplingstrategyforcontinuousportfoliooptimization
AT móuyǔxiáng anoveldeepreinforcementalgorithmwithadaptivesamplingstrategyforcontinuousportfoliooptimization
AT miaoyuhsiang yǐshēndùqiánghuàxuéxíjiéhézìshìyīngqǔyàngcèlüèyúliánxùtóuzīzǔhézuìjiāhuà
AT móuyǔxiáng yǐshēndùqiánghuàxuéxíjiéhézìshìyīngqǔyàngcèlüèyúliánxùtóuzīzǔhézuìjiāhuà
AT miaoyuhsiang noveldeepreinforcementalgorithmwithadaptivesamplingstrategyforcontinuousportfoliooptimization
AT móuyǔxiáng noveldeepreinforcementalgorithmwithadaptivesamplingstrategyforcontinuousportfoliooptimization
_version_ 1719296434501582848