Robust Control of Unknown Observable Nonlinear Systems Solved as a Zero-Sum Game
An optimal robust control solution for general nonlinear systems with unknown but observable dynamics is advanced here. The underlying Hamilton-Jacobi-Isaacs (HJI) equation of the corresponding zero-sum two-player game (ZS-TP-G) is learned using a Q-learning-based approach employing only input-output system measurements, assuming system observability. An equivalent virtual state-space model is built from the system's input-output samples and it is shown that controlling the former implies controlling the latter. Since the existence of a saddle-point solution to the ZS-TP-G is assumed unverifiable, the solution is derived in terms of upper-optimal and lower-optimal controllers. The learning convergence is theoretically ensured while practical implementation is performed using neural networks that provide scalability to the control problem dimension and automatic feature selection. The learning strategy is checked on an active suspension system, a good candidate for the robust control problem with respect to road profile disturbance rejection.
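The abstract's distinction between upper-optimal (min-max) and lower-optimal (max-min) controllers can be illustrated with a toy tabular sketch. This is not the paper's data-driven neural method; the dynamics, stage cost, and problem sizes below are invented purely for illustration:

```python
# Toy sketch (hypothetical, not the paper's algorithm): Q-iteration for a
# tabular zero-sum two-player game, showing the upper (min-max) and lower
# (max-min) game values the abstract refers to.
import itertools

N_S, N_U, N_W = 3, 2, 2          # states, control actions, disturbances
GAMMA = 0.9                      # discount factor

def step(s, u, w):
    """Invented deterministic dynamics and stage cost."""
    s_next = (s + u - w) % N_S
    cost = float(s) + 0.5 * u - 0.3 * w
    return s_next, cost

# Q[s][u][w]: cost-to-go when the controller plays u and the disturbance
# plays w in state s; the controller minimizes, the disturbance maximizes.
Q = [[[0.0] * N_W for _ in range(N_U)] for _ in range(N_S)]

def upper_value(q_s):
    # Upper value: controller hedges against the worst-case disturbance.
    return min(max(q_s[u][w] for w in range(N_W)) for u in range(N_U))

def lower_value(q_s):
    # Lower value: disturbance hedges against the best-case controller.
    return max(min(q_s[u][w] for u in range(N_U)) for w in range(N_W))

# Fixed-point iteration on the game Bellman equation (a contraction for
# GAMMA < 1, so it converges).
for _ in range(200):
    newQ = [[[0.0] * N_W for _ in range(N_U)] for _ in range(N_S)]
    for s, u, w in itertools.product(range(N_S), range(N_U), range(N_W)):
        s_next, cost = step(s, u, w)
        newQ[s][u][w] = cost + GAMMA * upper_value(Q[s_next])
    Q = newQ

# The upper value always dominates the lower value (weak duality);
# they coincide at every state only when a saddle point exists.
for s in range(N_S):
    assert upper_value(Q[s]) >= lower_value(Q[s]) - 1e-9
```

When the two values coincide at every state the game has a saddle point; the paper treats that existence as unverifiable and therefore derives both the upper-optimal and lower-optimal controllers.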
Main Authors: | Mircea-Bogdan Radac, Timotei Lala
---|---
Format: | Article
Language: | English
Published: | IEEE, 2020-01-01
Series: | IEEE Access
Subjects: | Active suspension system; approximate dynamic programming; neural networks; optimal control; reinforcement learning; state feedback
Online Access: | https://ieeexplore.ieee.org/document/9268935/
id
doaj-617dd7df8c9c4b3e92b4adfc4c90394a
record_format
Article
spelling
Mircea-Bogdan Radac (https://orcid.org/0000-0001-8410-6547) and Timotei Lala, both of the Department of Automation and Applied Informatics, Politehnica University of Timisoara, Timisoara, Romania. "Robust Control of Unknown Observable Nonlinear Systems Solved as a Zero-Sum Game," IEEE Access (ISSN 2169-3536), vol. 8, pp. 214153-214165, 2020-01-01. DOI: 10.1109/ACCESS.2020.3040185. IEEE Xplore document 9268935. Record updated 2021-03-30T03:52:11Z.
collection |
DOAJ |
description |
An optimal robust control solution for general nonlinear systems with unknown but observable dynamics is advanced here. The underlying Hamilton-Jacobi-Isaacs (HJI) equation of the corresponding zero-sum two-player game (ZS-TP-G) is learned using a Q-learning-based approach employing only input-output system measurements, assuming system observability. An equivalent virtual state-space model is built from the system's input-output samples and it is shown that controlling the former implies controlling the latter. Since the existence of a saddle-point solution to the ZS-TP-G is assumed unverifiable, the solution is derived in terms of upper-optimal and lower-optimal controllers. The learning convergence is theoretically ensured while practical implementation is performed using neural networks that provide scalability to the control problem dimension and automatic feature selection. The learning strategy is checked on an active suspension system, a good candidate for the robust control problem with respect to road profile disturbance rejection. |
topic
Active suspension system; approximate dynamic programming; neural networks; optimal control; reinforcement learning; state feedback