Closed-Loop Learning of Visual Control Policies

In this dissertation, I introduce a general, flexible framework for learning direct mappings from images to actions in an agent that interacts with its surrounding environment. This work is motivated by the paradigm of purposive vision. The original contributions consist in the design of reinforceme...

Bibliographic Details
Main Author:	Jodogne, Sébastien
Other Authors:	Charvillat, Vincent
Format:	Others
Published:	Universite de Liege 2006
Subjects:	Visual features; Reinforcement learning; Purposive vision
Online Access:	http://bictel.ulg.ac.be/ETD-db/collection/available/ULgetd-12072006-100239/

id	ndltd-BICfB-oai-ETDULg-ULgetd-12072006-100239
record_format	oai_dc
spelling	ndltd-BICfB-oai-ETDULg-ULgetd-12072006-100239 2013-01-07T15:43:31Z Closed-Loop Learning of Visual Control Policies Jodogne, Sébastien Visual features Reinforcement learning Purposive vision Charvillat, Vincent Munos, Remi Paletta, Lucas Verly, Jacques G. Wehenkel, Louis Piater, Justus H. Universite de Liege 2006-12-05 text application/pdf http://bictel.ulg.ac.be/ETD-db/collection/available/ULgetd-12072006-100239/ unrestricted I certify that I have completed and signed the BICTEL/e contract provided by the faculty administrator.
collection	NDLTD
format	Others
sources	NDLTD
topic	Visual features; Reinforcement learning; Purposive vision
spellingShingle	Visual features Reinforcement learning Purposive vision Jodogne, Sébastien Closed-Loop Learning of Visual Control Policies
description	In this dissertation, I introduce a general, flexible framework for learning direct mappings from images to actions in an agent that interacts with its surrounding environment. This work is motivated by the paradigm of purposive vision. The original contributions consist in the design of reinforcement learning algorithms that are applicable to visual spaces. Inspired by the paradigm of local-appearance vision, these algorithms exploit specialized visual features that can be detected in the visual signal. Two different ways to use the visual features are described. Firstly, I introduce adaptive-resolution methods for discretizing the visual space into a manageable number of perceptual classes. To this end, a percept classifier that tests the presence or absence of a few highly informative visual features is incrementally refined. New discriminant visual features are selected in a sequence of attempts to remove perceptual aliasing. Any standard reinforcement learning algorithm can then be used to extract an optimal visual control policy. The resulting algorithm is called "Reinforcement Learning of Visual Classes." Secondly, I propose to exploit the raw content of the visual features, without ever considering an equivalence relation on the visual feature space. Technically, feature regression models that associate visual features with a real-valued utility are introduced within the Approximate Policy Iteration architecture. This is done by means of a general, abstract version of Approximate Policy Iteration. This results in the "Visual Approximate Policy Iteration" algorithm. Another major contribution of this dissertation is the design of adaptive-resolution techniques that can be applied to complex, high-dimensional and/or continuous action spaces, simultaneously with visual spaces. The "Reinforcement Learning of Joint Classes" algorithm produces a non-uniform discretization of the joint space of percepts and actions. This is a brand new, general approach to adaptive-resolution methods in reinforcement learning that can deal with arbitrary, hybrid state-action spaces. Throughout this dissertation, emphasis is also put on the design of general algorithms that can be used in non-visual (e.g. continuous) perceptual spaces. The applicability of the proposed algorithms is demonstrated by solving several visual navigation tasks.
author2	Charvillat, Vincent
author_facet	Charvillat, Vincent Jodogne, Sébastien
author	Jodogne, Sébastien
author_sort	Jodogne, Sébastien
title	Closed-Loop Learning of Visual Control Policies
title_short	Closed-Loop Learning of Visual Control Policies
title_full	Closed-Loop Learning of Visual Control Policies
title_fullStr	Closed-Loop Learning of Visual Control Policies
title_full_unstemmed	Closed-Loop Learning of Visual Control Policies
title_sort	closed-loop learning of visual control policies
publisher	Universite de Liege
publishDate	2006
url	http://bictel.ulg.ac.be/ETD-db/collection/available/ULgetd-12072006-100239/
work_keys_str_mv	AT jodognesebastien closedlooplearningofvisualcontrolpolicies
_version_	1716393877303197696
