An online adaptive learning algorithm for optimal trade execution in high-frequency markets

A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy in the Faculty of Science, School of Computer Science and Applied Mathematics, University of the Witwatersrand. October 2016. === Automated algorithmic trade execution is a central problem in modern financi...


Bibliographic Details
Main Author: Hendricks, Dieter
Format: Others
Language: en
Published: 2017
Subjects: Algorithms; Financial markets
Online Access: Hendricks, Dieter (2016) An online adaptive learning algorithm for optimal trade execution in high-frequency markets, University of Witwatersrand, Johannesburg, <http://wiredspace.wits.ac.za/handle/10539/21710>
http://hdl.handle.net/10539/21710
id ndltd-netd.ac.za-oai-union.ndltd.org-wits-oai-wiredspace.wits.ac.za-10539-21710
record_format oai_dc
spelling ndltd-netd.ac.za-oai-union.ndltd.org-wits-oai-wiredspace.wits.ac.za-10539-21710 2019-05-11T03:40:22Z An online adaptive learning algorithm for optimal trade execution in high-frequency markets Hendricks, Dieter Algorithms Financial markets A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy in the Faculty of Science, School of Computer Science and Applied Mathematics, University of the Witwatersrand. October 2016. Automated algorithmic trade execution is a central problem in modern financial markets; however, finding and navigating optimal trajectories in this system is a non-trivial task. Many authors have developed exact analytical solutions by making simplifying assumptions regarding governing dynamics; however, for practical feasibility and robustness, a more dynamic approach is needed to capture the spatial and temporal system complexity and adapt as intraday regimes change. This thesis aims to consolidate four key ideas: 1) the financial market as a complex adaptive system, where purposeful agents with varying system visibility collectively and simultaneously create and perceive their environment as they interact with it; 2) spin glass models as a tractable formalism to model phenomena in this complex system; 3) the multivariate Hawkes process as a candidate governing process for limit order book events; and 4) reinforcement learning as a framework for online, adaptive learning. Combined with the data and computational challenges of developing an efficient, machine-scale trading algorithm, we present a feasible scheme which systematically encodes these ideas. We first determine the efficacy of the proposed learning framework, under the conjecture of approximate Markovian dynamics in the equity market.
We find that a simple lookup table Q-learning algorithm, with discrete state attributes and discrete actions, is able to improve post-trade implementation shortfall by adapting a typical static arrival-price volume trajectory with respect to prevailing market microstructure features streaming from the limit order book. To enumerate a scale-specific state space whilst avoiding the curse of dimensionality, we propose a novel approach to detect the intraday temporal financial market state at each decision point in the Q-learning algorithm, inspired by the complex adaptive system paradigm. A physical analogy to the ferromagnetic Potts model at thermal equilibrium is used to develop a high-speed maximum likelihood clustering algorithm, appropriate for measuring critical or near-critical temporal states in the financial system. State features are studied to extract time-scale-specific state signature vectors, which serve as low-dimensional state descriptors and enable online state detection. To assess the impact of agent interactions on the system, a multivariate Hawkes process is used to measure the resiliency of the limit order book with respect to liquidity-demand events of varying size. By studying the branching ratios associated with key quote replenishment intensities following trades, we ensure that the limit order book is expected to be resilient with respect to the maximum permissible trade executed by the agent. Finally, we present a feasible scheme for unsupervised state discovery, state detection and online learning for high-frequency quantitative trading agents faced with a multifeatured, asynchronous market data feed. We provide a technique for enumerating the state space at the scale at which the agent interacts with the system, incorporating the effects of a live trading agent on limit order book dynamics into the market data feed, and hence the perceived state evolution.
LG2017 2017-01-20T05:34:55Z 2017-01-20T05:34:55Z 2016 Thesis Hendricks, Dieter (2016) An online adaptive learning algorithm for optimal trade execution in high-frequency markets, University of Witwatersrand, Johannesburg, <http://wiredspace.wits.ac.za/handle/10539/21710> http://hdl.handle.net/10539/21710 en Online resource (189 leaves) application/pdf
collection NDLTD
language en
format Others
sources NDLTD
topic Algorithms
Financial markets
description A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy in the Faculty of Science, School of Computer Science and Applied Mathematics, University of the Witwatersrand. October 2016. === Automated algorithmic trade execution is a central problem in modern financial markets; however, finding and navigating optimal trajectories in this system is a non-trivial task. Many authors have developed exact analytical solutions by making simplifying assumptions regarding governing dynamics; however, for practical feasibility and robustness, a more dynamic approach is needed to capture the spatial and temporal system complexity and adapt as intraday regimes change. This thesis aims to consolidate four key ideas: 1) the financial market as a complex adaptive system, where purposeful agents with varying system visibility collectively and simultaneously create and perceive their environment as they interact with it; 2) spin glass models as a tractable formalism to model phenomena in this complex system; 3) the multivariate Hawkes process as a candidate governing process for limit order book events; and 4) reinforcement learning as a framework for online, adaptive learning. Combined with the data and computational challenges of developing an efficient, machine-scale trading algorithm, we present a feasible scheme which systematically encodes these ideas. We first determine the efficacy of the proposed learning framework, under the conjecture of approximate Markovian dynamics in the equity market. We find that a simple lookup table Q-learning algorithm, with discrete state attributes and discrete actions, is able to improve post-trade implementation shortfall by adapting a typical static arrival-price volume trajectory with respect to prevailing market microstructure features streaming from the limit order book.
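The lookup-table Q-learning idea mentioned above can be illustrated with a minimal sketch. It is not the thesis's implementation: the class name, the hyperparameters, the state tuple and the reward are all hypothetical choices made only to show the standard tabular update with discrete states and discrete actions.

```python
import random
from collections import defaultdict


class TabularQLearner:
    """Generic lookup-table Q-learning over discrete states and actions.

    Illustrative only: the names and default hyperparameters here are
    assumptions, not values taken from the thesis.
    """

    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha = alpha        # learning rate
        self.gamma = gamma        # discount factor
        self.epsilon = epsilon    # exploration probability
        self.rng = random.Random(seed)
        self.q = defaultdict(float)  # (state, action) -> value, 0.0 if unseen

    def best_action(self, state):
        """Greedy action for a state; ties go to the first listed action."""
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def choose(self, state):
        """Epsilon-greedy action selection."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best_action(state)

    def update(self, state, action, reward, next_state, terminal=False):
        """One-step Q-learning update towards reward + gamma * max Q(next)."""
        target = reward
        if not terminal:
            target += self.gamma * max(
                self.q[(next_state, a)] for a in self.actions
            )
        key = (state, action)
        self.q[key] += self.alpha * (target - self.q[key])


# Hypothetical usage: a state is a tuple of discretised attributes
# (for example time bucket and spread bucket), an action scales the
# volume of a baseline trajectory, and the reward is a made-up number
# standing in for the negative of a shortfall contribution.
learner = TabularQLearner(actions=[0.5, 1.0, 1.5], epsilon=0.0)
learner.update(state=(0, 1), action=1.0, reward=-0.2, next_state=(1, 1))
```

In a sketch like this the table only grows with the state-action pairs actually visited, which is one reason a coarse, discrete state description keeps the approach tractable.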
To enumerate a scale-specific state space whilst avoiding the curse of dimensionality, we propose a novel approach to detect the intraday temporal financial market state at each decision point in the Q-learning algorithm, inspired by the complex adaptive system paradigm. A physical analogy to the ferromagnetic Potts model at thermal equilibrium is used to develop a high-speed maximum likelihood clustering algorithm, appropriate for measuring critical or near-critical temporal states in the financial system. State features are studied to extract time-scale-specific state signature vectors, which serve as low-dimensional state descriptors and enable online state detection. To assess the impact of agent interactions on the system, a multivariate Hawkes process is used to measure the resiliency of the limit order book with respect to liquidity-demand events of varying size. By studying the branching ratios associated with key quote replenishment intensities following trades, we ensure that the limit order book is expected to be resilient with respect to the maximum permissible trade executed by the agent. Finally, we present a feasible scheme for unsupervised state discovery, state detection and online learning for high-frequency quantitative trading agents faced with a multifeatured, asynchronous market data feed. We provide a technique for enumerating the state space at the scale at which the agent interacts with the system, incorporating the effects of a live trading agent on limit order book dynamics into the market data feed, and hence the perceived state evolution. === LG2017
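The branching ratios mentioned above can be made concrete with a small sketch. It assumes exponential excitation kernels of the form alpha * exp(-beta * t), a common textbook parameterisation; the abstract does not state which kernels the thesis uses, and the function names and the power-iteration estimate are illustrative assumptions. Under that assumption each branching ratio is alpha / beta, and the process is stationary when the spectral radius of the branching matrix is below one.

```python
def branching_matrix(alpha, beta):
    """Branching ratios for exponential kernels alpha * exp(-beta * t).

    Entry [i][j] is the integral of the kernel over t >= 0, which is
    alpha[i][j] / beta[i][j]. The kernel form is an assumption.
    """
    n = len(alpha)
    if len(beta) != n or any(len(r) != n for r in alpha) or any(
        len(r) != n for r in beta
    ):
        raise ValueError("alpha and beta must be square and the same size")
    matrix = []
    for a_row, b_row in zip(alpha, beta):
        row = []
        for a, b in zip(a_row, b_row):
            if a < 0 or b <= 0:
                raise ValueError("need alpha >= 0 and beta > 0")
            row.append(a / b)
        matrix.append(row)
    return matrix


def spectral_radius(matrix, iterations=500):
    """Estimate the spectral radius of a non-negative square matrix.

    Power iteration is run on (matrix + I): the shift keeps the iterate
    strictly positive and makes the dominant eigenvalue unique in modulus.
    This is a numerical estimate, not an exact eigenvalue computation.
    """
    n = len(matrix)
    shifted = [
        [matrix[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    x = [1.0] * n
    estimate = 1.0
    for _ in range(iterations):
        y = [sum(shifted[i][j] * x[j] for j in range(n)) for i in range(n)]
        estimate = max(y)  # x has max-norm 1, so this approximates rho + 1
        x = [v / estimate for v in y]
    return estimate - 1.0


def is_stationary(alpha, beta):
    """True when the estimated spectral radius is below one."""
    return spectral_radius(branching_matrix(alpha, beta)) < 1.0
```

A spectral radius close to one would indicate that each event triggers almost one further event on average, so estimates near that boundary should be read with care given the numerical approximation.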
author Hendricks, Dieter
title An online adaptive learning algorithm for optimal trade execution in high-frequency markets
title_sort online adaptive learning algorithm for optimal trade execution in high-frequency markets
publishDate 2017
url Hendricks, Dieter (2016) An online adaptive learning algorithm for optimal trade execution in high-frequency markets, University of Witwatersrand, Johannesburg, <http://wiredspace.wits.ac.za/handle/10539/21710>
http://hdl.handle.net/10539/21710