Applications of Information Inequalities to Linear Systems : Adaptive Control and Security
Main Author: | |
---|---|
Format: | Others |
Language: | English |
Published: | KTH, Reglerteknik, 2021 |
Subjects: | |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-291203 http://nbn-resolving.de/urn:isbn:978-91-7873-804-5 |
Summary: | This thesis considers the application of information inequalities, that is, Cramér-Rao type bounds based on Fisher information, to linear systems. These tools are used to study the trade-offs between learning and performance in two application areas: adaptive control and control systems security. In the first part of the thesis, we study stochastic adaptive control of linear quadratic regulators (LQR). Here, information inequalities are used to derive instance-dependent regret lower bounds. First, we consider a simplified version of LQR, a memoryless reference tracking model, and show how regret can be linked to a cumulative estimation error. This link is then exploited to derive a regret lower bound in terms of the Fisher information generated by the experiment of the optimal policy. It is shown that if the optimal policy has ill-conditioned Fisher information, then so does any low-regret policy. This is combined with a Cramér-Rao bound to give a regret lower bound of order square-root T in the time horizon T for a class of instances we call uninformative. The lower bound holds for all policies that depend smoothly on the underlying parametrization. Second, we extend these results to the general LQR model and to arbitrary affine parametrizations of the instance parameters. The notion of uninformativeness is generalized to this setting to give a structure-dependent rank condition for when logarithmic regret is impossible. This is done by reducing regret to a cumulative Bellman error. Due to the quadratic nature of LQR, this Bellman error turns out to be a quadratic form, which again can be interpreted as an estimation error. Using this, we prove a local minimax regret lower bound, whose proof relates the minimax regret to a Bayesian estimation problem and then applies Van Trees' inequality.
Again, it is shown that an appropriate information quantity of any low-regret policy is similar to that of the optimal policy, and that any uninformative instance suffers local minimax regret of order at least square-root T. Moreover, it is shown that the notion of uninformativeness, when specialized to certain well-understood scenarios, yields a tight characterization of square-root regret. In the second part of this thesis, we study control systems security problems from a Fisher information point of view. First, we consider a secure state estimation problem and characterize the maximal impact an adversary can cause by means of least informative distributions, that is, those which maximize the Cramér-Rao bound. For a linear measurement equation, it is shown that the least informative distribution, subject to variance and sparsity constraints, can be computed by solving a semidefinite program, which becomes mixed-integer in the presence of sparsity constraints. Furthermore, by relying on well-known results on minimax and robust estimation, a game-theoretic interpretation of this characterization of the maximum impact is offered. Last, we consider a Fisher information regularized minimum variance control objective to study the trade-offs between parameter privacy and control performance. This objective can be motivated, for instance, by learning-based attacks, in which case one seeks to leak as little information as possible to a system-identification adversary. Supposing that the feedback law is linear, the noise distribution minimizing the trace of Fisher information subject to a state variance penalty is found to be conditionally Gaussian. QC 20210310 |
---|