Closed-Loop Learning of Visual Control Policies

In this dissertation, I introduce a general, flexible framework for learning direct mappings from images to actions in an agent that interacts with its surrounding environment. This work is motivated by the paradigm of purposive vision.

Bibliographic Details
Main Author: Jodogne, Sébastien
Other Authors: Charvillat, Vincent
Format: Others
Published: Universite de Liege 2006
Subjects: Visual features; Reinforcement learning; Purposive vision
Online Access: http://bictel.ulg.ac.be/ETD-db/collection/available/ULgetd-12072006-100239/
id ndltd-BICfB-oai-ETDULg-ULgetd-12072006-100239
record_format oai_dc
spelling ndltd-BICfB-oai-ETDULg-ULgetd-12072006-100239
2013-01-07T15:43:31Z
Closed-Loop Learning of Visual Control Policies
Jodogne, Sébastien
Visual features
Reinforcement learning
Purposive vision
Charvillat, Vincent
Munos, Remi
Paletta, Lucas
Verly, Jacques G.
Wehenkel, Louis
Piater, Justus H.
Universite de Liege
2006-12-05
text
application/pdf
http://bictel.ulg.ac.be/ETD-db/collection/available/ULgetd-12072006-100239/
unrestricted
I certify that I have completed and signed the BICTEL/e contract provided by the faculty administrator.
collection NDLTD
format Others
sources NDLTD
topic Visual features
Reinforcement learning
Purposive vision
description In this dissertation, I introduce a general, flexible framework for learning direct mappings from images to actions in an agent that interacts with its surrounding environment. This work is motivated by the paradigm of purposive vision. The original contributions consist of the design of reinforcement learning algorithms that are applicable to visual spaces. Inspired by the paradigm of local-appearance vision, these algorithms exploit specialized visual features that can be detected in the visual signal. Two different ways of using the visual features are described. First, I introduce adaptive-resolution methods for discretizing the visual space into a manageable number of perceptual classes. To this end, a percept classifier that tests for the presence or absence of a few highly informative visual features is incrementally refined. New discriminant visual features are selected in a sequence of attempts to remove perceptual aliasing. Any standard reinforcement learning algorithm can then be used to extract an optimal visual control policy. The resulting algorithm is called "Reinforcement Learning of Visual Classes." Second, I propose to exploit the raw content of the visual features, without ever considering an equivalence relation on the visual feature space. Technically, feature regression models that associate visual features with a real-valued utility are introduced within the Approximate Policy Iteration architecture by means of a general, abstract version of Approximate Policy Iteration. This results in the "Visual Approximate Policy Iteration" algorithm.
Another major contribution of this dissertation is the design of adaptive-resolution techniques that can be applied to complex, high-dimensional and/or continuous action spaces together with visual spaces. The "Reinforcement Learning of Joint Classes" algorithm produces a non-uniform discretization of the joint space of percepts and actions. This is a new, general approach to adaptive-resolution methods in reinforcement learning that can deal with arbitrary, hybrid state-action spaces. Throughout this dissertation, emphasis is also placed on the design of general algorithms that can be used in non-visual (e.g. continuous) perceptual spaces. The applicability of the proposed algorithms is demonstrated by solving several visual navigation tasks.
author2 Charvillat, Vincent
author Jodogne, Sébastien
title Closed-Loop Learning of Visual Control Policies
publisher Universite de Liege
publishDate 2006
url http://bictel.ulg.ac.be/ETD-db/collection/available/ULgetd-12072006-100239/