Feedback-motion-planning with simulation-based LQR-trees

Bibliographic Details
Main Authors: Reist, Philipp (Author), Preiswerk, Pascal (Author), Tedrake, Russell L (Author)
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory (Contributor)
Format: Article
Language: English
Published: SAGE Publications, 2020-03-26.
Subjects: Feedback motion-planning; random sampling; feedback policy; nonlinear dynamic system; trajectory library
Description
Summary: The paper presents the simulation-based variant of the LQR-tree feedback-motion-planning approach. The algorithm generates a control policy that stabilizes a nonlinear dynamic system from a bounded set of initial conditions to a goal. This policy is represented by a tree of feedback-stabilized trajectories. The algorithm explores the bounded set with random state samples and, where needed, adds new trajectories to the tree using motion planning. Simultaneously, the algorithm approximates the funnel of each trajectory, which is the set of states that the trajectory's feedback policy can stabilize to the goal. Generating a control policy that stabilizes the bounded set to the goal is thus equivalent to adding trajectories to the tree until their funnels cover the set. In previous work, funnels were approximated with sums-of-squares verification. Here, funnels are approximated by sampling and falsification by simulation, which allows application to a broader range of systems and straightforward enforcement of input and state constraints. A theoretical analysis shows that, in the long run, the algorithm tends to improve both the coverage of the bounded set and the funnel approximations. Focusing on the practical application of the method, a detailed example implementation is given and used to generate policies for two example systems. Simulation results support the theoretical findings, while experiments demonstrate the algorithm's ability to enforce state constraints and its applicability to highly dynamic systems.
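As a rough illustration of the sampling-and-falsification idea described above, the Python sketch below approximates a single funnel for a hand-tuned, input-saturated state-feedback policy on a 1-D double integrator. This is not the authors' implementation: the funnel is crudely represented as a ball radius around the goal, all gains and constants are illustrative assumptions, and the tree-growing step (adding a new trajectory via motion planning when a sample is uncovered) is reduced to a comment.

import math
import random

DT, STEPS, U_MAX = 0.01, 2000, 5.0  # step size, simulation horizon, input constraint
K = (1.0, 1.7)  # hand-picked stabilizing feedback gain (an assumption, not from the paper)

def simulate_closed_loop(x0, v0):
    """Falsification by simulation: run the saturated feedback policy on the
    double integrator (x'' = u) and report whether it reaches the goal (origin)."""
    x, v = x0, v0
    for _ in range(STEPS):
        u = max(-U_MAX, min(U_MAX, -(K[0] * x + K[1] * v)))  # enforce input constraint
        x, v = x + DT * v, v + DT * u  # explicit Euler step
    return math.hypot(x, v) < 1e-2  # close enough to the goal

def approximate_funnel(num_samples=500):
    """Approximate the policy's funnel by random sampling and falsification."""
    funnel_radius = 10.0  # optimistic initial funnel hypothesis
    for _ in range(num_samples):
        x0, v0 = random.uniform(-8, 8), random.uniform(-8, 8)  # sample the bounded set
        if math.hypot(x0, v0) > funnel_radius:
            continue  # uncovered sample: the full algorithm would add a trajectory here
        if not simulate_closed_loop(x0, v0):
            funnel_radius = math.hypot(x0, v0)  # falsified: shrink the funnel estimate
    return funnel_radius

print("approximated funnel radius:", approximate_funnel())

Each falsifying sample shrinks the optimistic hypothesis toward the true funnel, mirroring the long-run improvement of the funnel approximations noted in the summary.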
ETH (Research Grant ETH-31 11-1)