Learning to Tune a Class of Controllers with Deep Reinforcement Learning
Main Author: William John Shipman (Measurement and Control Division, Mintek, Johannesburg 2194, South Africa)
Format: Article
Language: English
Published: MDPI AG, 2021-09-01
Series: Minerals
ISSN: 2075-163X
DOI: 10.3390/min11090989
Subjects: reinforcement learning; deep neural network; proportional-integral control
Online Access: https://www.mdpi.com/2075-163X/11/9/989
Description:
Control systems require maintenance in the form of tuning their parameters to maximize performance in the face of process changes in minerals processing circuits. This work focuses on using deep reinforcement learning to train an agent to perform this maintenance continuously. A generic simulation of a first-order process with a time delay, controlled by a proportional-integral (PI) controller, served as the training environment. Domain randomization in this environment was used to help the agent generalize to unseen conditions on a physical circuit. Proximal policy optimization (PPO) was used to train the agent, and hyper-parameter optimization was performed to select the agent's neural network size and the training algorithm parameters. Two agents were tested to examine the impact of the agent's observation space; the best observation was found to consist of the parameters of an auto-regressive with exogenous input (ARX) model fitted to measurements of the controlled variable. The best trained agent was deployed at an industrial comminution circuit, where it was tested on two flow rate control loops. The agent improved the performance of one of these loops but degraded the performance of the other. While deep reinforcement learning shows promise for controller tuning, several challenges and directions for further study were identified.
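The training environment is described only at a high level: a first-order process with a time delay under PI control, with domain randomization over the process dynamics. As a rough illustration of that setup, the sketch below simulates such a loop in Python; the class name, parameter ranges, and reward choice are assumptions for illustration, not taken from the paper.

```python
import numpy as np

# Minimal sketch of a training environment in the spirit of the paper:
# a first-order-plus-dead-time process under PI control, with the process
# gain, time constant, and delay re-sampled each episode (domain
# randomization). All names and ranges here are illustrative assumptions.
class FOPDTPILoop:
    def __init__(self, dt=1.0, rng=None):
        self.dt = dt
        self.rng = np.random.default_rng() if rng is None else rng

    def reset(self):
        # Domain randomization: new process dynamics every episode.
        self.K = self.rng.uniform(0.5, 2.0)      # process gain
        self.tau = self.rng.uniform(5.0, 50.0)   # time constant [s]
        delay = self.rng.uniform(1.0, 10.0)      # dead time [s]
        self.u_buf = [0.0] * max(1, int(delay / self.dt))
        self.y = 0.0
        self.integral = 0.0
        self.setpoint = self.rng.uniform(-1.0, 1.0)

    def step(self, kp, ki, n_steps=200):
        """Run the loop with the PI tuning (kp, ki) chosen by the agent."""
        errors = []
        for _ in range(n_steps):
            e = self.setpoint - self.y
            self.integral += e * self.dt
            u = kp * e + ki * self.integral       # PI control law
            self.u_buf.append(u)
            u_delayed = self.u_buf.pop(0)         # apply the dead time
            # Euler step of dy/dt = (K * u(t - theta) - y) / tau
            self.y += self.dt * (self.K * u_delayed - self.y) / self.tau
            errors.append(e)
        # Reward: negative mean absolute error (one plausible choice).
        return -np.mean(np.abs(errors))
```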
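The abstract reports that the most effective observation was the parameter vector of an ARX model fitted to measurements of the controlled variable. A plain least-squares fit is one standard way to obtain such parameters; the sketch below shows that idea, with model orders `na` and `nb` chosen arbitrarily (the abstract does not state them).

```python
import numpy as np

def arx_observation(y, u, na=2, nb=2):
    """Least-squares fit of an ARX model
        y[k] = a1*y[k-1] + ... + a_na*y[k-na]
             + b1*u[k-1] + ... + b_nb*u[k-nb]
    returning its parameters as the agent's observation vector."""
    n = max(na, nb)
    rows, targets = [], []
    for k in range(n, len(y)):
        past_y = [y[k - i] for i in range(1, na + 1)]
        past_u = [u[k - i] for i in range(1, nb + 1)]
        rows.append(past_y + past_u)
        targets.append(y[k])
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return theta  # shape (na + nb,)
```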
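Training used PPO, with the network size and algorithm parameters selected by hyper-parameter optimization. The snippet below shows what such a training run might look like using the Stable-Baselines3 library, assuming the simulated loop above has been wrapped in a Gym-style environment; the library choice and all hyper-parameter values are placeholders, not the paper's selected configuration.

```python
from stable_baselines3 import PPO

# `env` is assumed to be a gym.Env wrapping the FOPDTPILoop simulation,
# exposing the ARX parameter vector as its observation and the PI tuning
# (kp, ki) as its action. Hyper-parameters shown are placeholders; the
# paper chose these values via hyper-parameter optimization.
model = PPO(
    "MlpPolicy",
    env,
    policy_kwargs=dict(net_arch=[64, 64]),  # network size: a tuned choice
    learning_rate=3e-4,
    n_steps=2048,
    verbose=1,
)
model.learn(total_timesteps=1_000_000)
```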