Summary: | An optimal robust control solution for general nonlinear systems with unknown but observable dynamics is advanced here. The underlying Hamilton-Jacobi-Isaacs (HJI) equation of the corresponding zero-sum two-player game (ZS-TP-G) is learned using a Q-learning-based approach employing only input-output system measurements, assuming system observability. An equivalent virtual state-space model is built from the system's input-output samples and it is shown that controlling the former implies controlling the latter. Since the existence of a saddle-point solution to the ZS-TP-G is assumed unverifiable, the solution is derived in terms of upper-optimal and lower-optimal controllers. The learning convergence is theoretically ensured while practical implementation is performed using neural networks that provide scalability to the control problem dimension and automatic feature selection. The learning strategy is checked on an active suspension system, a good candidate for the robust control problem with respect to road profile disturbance rejection.
|