Robust Control of Unknown Observable Nonlinear Systems Solved as a Zero-Sum Game

An optimal robust control solution for general nonlinear systems with unknown but observable dynamics is advanced here. The underlying Hamilton-Jacobi-Isaacs (HJI) equation of the corresponding zero-sum two-player game (ZS-TP-G) is learned using a Q-learning-based approach employing only input-output system measurements, assuming system observability. An equivalent virtual state-space model is built from the system's input-output samples and it is shown that controlling the former implies controlling the latter. Since the existence of a saddle-point solution to the ZS-TP-G is assumed unverifiable, the solution is derived in terms of upper-optimal and lower-optimal controllers. The learning convergence is theoretically ensured while practical implementation is performed using neural networks that provide scalability to the control problem dimension and automatic feature selection. The learning strategy is checked on an active suspension system, a good candidate for the robust control problem with respect to road profile disturbance rejection.


Bibliographic Details
Main Authors: Mircea-Bogdan Radac, Timotei Lala
Format: Article
Language: English
Published: IEEE, 2020-01-01
Series: IEEE Access
Subjects: Active suspension system; approximate dynamic programming; neural networks; optimal control; reinforcement learning; state feedback
Online Access: https://ieeexplore.ieee.org/document/9268935/
DOI: 10.1109/ACCESS.2020.3040185
ISSN: 2169-3536
Published in: IEEE Access, vol. 8, 2020, pp. 214153-214165 (article 9268935)
Source: DOAJ (record doaj-617dd7df8c9c4b3e92b4adfc4c90394a)
Authors: Mircea-Bogdan Radac (https://orcid.org/0000-0001-8410-6547), Timotei Lala — Department of Automation and Applied Informatics, Politehnica University of Timisoara, Timisoara, Romania