Closed-Loop Learning of Visual Control Policies
In this dissertation, I introduce a general, flexible framework for learning direct mappings from images to actions in an agent that interacts with its surrounding environment. This work is motivated by the paradigm of purposive vision. The original contributions consist in the design of reinforceme...
Main Author: | Jodogne, Sébastien |
---|---|
Other Authors: | Charvillat, Vincent |
Format: | Others |
Published: | Universite de Liege, 2006 |
Subjects: | Visual features; Reinforcement learning; Purposive vision |
Online Access: | http://bictel.ulg.ac.be/ETD-db/collection/available/ULgetd-12072006-100239/ |
id |
ndltd-BICfB-oai-ETDULg-ULgetd-12072006-100239 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-BICfB-oai-ETDULg-ULgetd-12072006-100239; 2013-01-07T15:43:31Z; Closed-Loop Learning of Visual Control Policies; Jodogne, Sébastien; Visual features; Reinforcement learning; Purposive vision; Charvillat, Vincent; Munos, Remi; Paletta, Lucas; Verly, Jacques G.; Wehenkel, Louis; Piater, Justus H.; Universite de Liege; 2006-12-05; text; application/pdf; http://bictel.ulg.ac.be/ETD-db/collection/available/ULgetd-12072006-100239/; unrestricted; I certify that I have completed and signed the BICTEL/e contract provided by the faculty administrator. |
collection |
NDLTD |
format |
Others |
sources |
NDLTD |
topic |
Visual features Reinforcement learning Purposive vision |
description |
In this dissertation, I introduce a general, flexible framework for learning direct mappings from images to actions in an agent that interacts with its surrounding environment. This work is motivated by the paradigm of purposive vision. The original contributions consist in the design of reinforcement learning algorithms that are applicable to visual spaces. Inspired by the paradigm of local-appearance vision, these algorithms exploit specialized visual features that can be detected in the visual signal.
Two different ways to use the visual features are described. Firstly, I introduce adaptive-resolution methods for discretizing the visual space into a manageable number of perceptual classes. To this end, a percept classifier that tests the presence or absence of a few highly informative visual features is incrementally refined. New discriminant visual features are selected in a sequence of attempts to remove perceptual aliasing. Any standard reinforcement learning algorithm can then be used to extract an optimal visual control policy. The resulting algorithm is called "Reinforcement Learning of Visual Classes." Secondly, I propose to exploit the raw content of the visual features, without ever considering an equivalence relation on the visual feature space. Technically, feature regression models that associate visual features with a real-valued utility are introduced within the Approximate Policy Iteration architecture, by means of a general, abstract version of Approximate Policy Iteration. This results in the "Visual Approximate Policy Iteration" algorithm.
Another major contribution of this dissertation is the design of adaptive-resolution techniques that can be applied to complex, high-dimensional and/or continuous action spaces at the same time as visual spaces. The "Reinforcement Learning of Joint Classes" algorithm produces a non-uniform discretization of the joint space of percepts and actions. This is a novel, general approach to adaptive-resolution methods in reinforcement learning that can deal with arbitrary, hybrid state-action spaces.
Throughout this dissertation, emphasis is also put on the design of general algorithms that can be used in non-visual (e.g. continuous) perceptual spaces. The applicability of the proposed algorithms is demonstrated by solving several visual navigation tasks. |
author2 |
Charvillat, Vincent |
author_facet |
Charvillat, Vincent Jodogne, Sébastien |
author |
Jodogne, Sébastien |
title |
Closed-Loop Learning of Visual Control Policies |
publisher |
Universite de Liege |
publishDate |
2006 |
url |
http://bictel.ulg.ac.be/ETD-db/collection/available/ULgetd-12072006-100239/ |