An online adaptive learning algorithm for optimal trade execution in high-frequency markets
A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy in the Faculty of Science, School of Computer Science and Applied Mathematics, University of the Witwatersrand. October 2016. === Automated algorithmic trade execution is a central problem in modern financi...
Main Author: | Hendricks, Dieter |
---|---|
Format: | Others |
Language: | en |
Published: | 2017 |
Subjects: | Algorithms; Financial markets |
Online Access: | Hendricks, Dieter (2016) An online adaptive learning algorithm for optimal trade execution in high-frequency markets, University of Witwatersrand, Johannesburg, <http://wiredspace.wits.ac.za/handle/10539/21710> http://hdl.handle.net/10539/21710 |
id |
ndltd-netd.ac.za-oai-union.ndltd.org-wits-oai-wiredspace.wits.ac.za-10539-21710 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-netd.ac.za-oai-union.ndltd.org-wits-oai-wiredspace.wits.ac.za-10539-21710 2019-05-11T03:40:22Z LG2017 2017-01-20T05:34:55Z 2017-01-20T05:34:55Z 2016 Thesis en Online resource (189 leaves) application/pdf |
collection |
NDLTD |
language |
en |
format |
Others |
sources |
NDLTD |
topic |
Algorithms; Financial markets |
description |
A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy
in the Faculty of Science, School of Computer Science and Applied Mathematics,
University of the Witwatersrand. October 2016. === Automated algorithmic trade execution is a central problem in modern financial markets;
however, finding and navigating optimal trajectories in this system is a non-trivial
task. Many authors have developed exact analytical solutions by making simplifying
assumptions regarding governing dynamics; however, for practical feasibility and robustness,
a more dynamic approach is needed to capture the spatial and temporal system
complexity and adapt as intraday regimes change.
This thesis aims to consolidate four key ideas: 1) the financial market as a complex
adaptive system, where purposeful agents with varying system visibility collectively and
simultaneously create and perceive their environment as they interact with it; 2) spin
glass models as a tractable formalism to model phenomena in this complex system; 3) the
multivariate Hawkes process as a candidate governing process for limit order book events;
and 4) reinforcement learning as a framework for online, adaptive learning. Combined
with the data and computational challenges of developing an efficient, machine-scale
trading algorithm, we present a feasible scheme which systematically encodes these ideas.
We first determine the efficacy of the proposed learning framework, under the conjecture
of approximate Markovian dynamics in the equity market. We find that a simple lookup
table Q-learning algorithm, with discrete state attributes and discrete actions, is able
to improve post-trade implementation shortfall by adapting a typical static arrival-price
volume trajectory with respect to prevailing market microstructure features streaming
from the limit order book.
To enumerate a scale-specific state space whilst avoiding the curse of dimensionality, we
propose a novel approach to detect the intraday temporal financial market state at each
decision point in the Q-learning algorithm, inspired by the complex adaptive system
paradigm. A physical analogy to the ferromagnetic Potts model at thermal equilibrium
is used to develop a high-speed maximum likelihood clustering algorithm, appropriate
for measuring critical or near-critical temporal states in the financial system. State
features are studied to extract time-scale-specific state signature vectors, which serve as
low-dimensional state descriptors and enable online state detection.
To assess the impact of agent interactions on the system, a multivariate Hawkes process is
used to measure the resiliency of the limit order book with respect to liquidity-demand
events of varying size. By studying the branching ratios associated with key quote
replenishment intensities following trades, we ensure that the limit order book is expected
to be resilient with respect to the maximum permissible trade executed by the agent.
Finally, we present a feasible scheme for unsupervised state discovery, state detection
and online learning for high-frequency quantitative trading agents faced with a multifeatured,
asynchronous market data feed. We provide a technique for enumerating the
state space at the scale at which the agent interacts with the system, incorporating the
effects of a live trading agent on limit order book dynamics into the market data feed,
and hence the perceived state evolution. === LG2017 |
author |
Hendricks, Dieter |
title |
An online adaptive learning algorithm for optimal trade execution in high-frequency markets |
publishDate |
2017 |
url |
Hendricks, Dieter (2016) An online adaptive learning algorithm for optimal trade execution in high-frequency markets, University of Witwatersrand, Johannesburg, <http://wiredspace.wits.ac.za/handle/10539/21710> http://hdl.handle.net/10539/21710 |
_version_ |
1719081517823557632 |
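
The description above refers to a simple lookup table Q-learning algorithm with discrete state attributes and discrete actions. A minimal sketch of that general technique follows. It is not the thesis implementation: the state tuple (time bucket, spread bucket), the volume multipliers in ACTIONS, the class name TabularQ and the reward convention (for example, negative implementation shortfall) are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict

# Assumed action set: multipliers applied to the baseline child-order volume
# of a static trajectory. These values are illustrative, not from the thesis.
ACTIONS = (0.5, 1.0, 1.5)


class TabularQ:
    """Lookup-table Q-learning over discrete states and discrete actions."""

    def __init__(self, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
        self.q = defaultdict(float)  # (state, action) -> value, defaults to 0.0
        self.alpha = alpha           # learning rate
        self.gamma = gamma           # discount factor
        self.epsilon = epsilon       # exploration probability
        self.rng = random.Random(seed)

    def act(self, state):
        """Epsilon-greedy choice of a volume multiplier for this state."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state, done=False):
        """One-step Q-learning update towards reward + gamma * max_a Q(s', a)."""
        best_next = 0.0 if done else max(self.q[(next_state, a)] for a in ACTIONS)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


# Hypothetical usage: state = (time bucket, spread bucket); the reward would be
# supplied by the execution environment, e.g. negative shortfall for the slice.
agent = TabularQ(alpha=0.5, gamma=0.9, epsilon=0.0)
agent.update(state=(0, 0), action=1.5, reward=1.0, next_state=(1, 0))
chosen = agent.act((0, 0))
```

A dictionary keyed by (state, action) keeps the table sparse, so only states actually visited consume memory. That suits a discretised state space whose full enumeration may be large.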