LEARNING A VISUAL FORWARD MODEL

Bibliographic Details
Main Author: GHADIRZADEH, ALI
Format: Others
Language: English
Published: KTH, Skolan för datavetenskap och kommunikation (CSC) 2013
Subjects:
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-142034
Description
Summary: Internal forward models aim to give a system predictions of how its sensory observations change as a consequence of its own actions. In the special case where the sensed information takes the form of camera images, the model is called a visual forward model. Images are one of the richest sources of data, and the ability to predict camera images enables robots to perform more autonomous and intelligent tasks. Most actions performed by robots lead to outcomes that appear in the vision system. The capability to predict these outcomes in the form of images therefore helps the robot to execute better long-term plans, which is why visual forward models are of particular importance. The main challenges in constructing visual forward models are the large amount of image data to be predicted and the degrees of freedom of the robot's actions, which cause the complexity to grow rapidly. In this work, we investigate different methods for constructing visual forward models for a robotic camera head setup. The forward model explores the contingencies between movements of the robot's neck and eye joints and the resulting changes in the camera images. Four different methods for constructing visual forward models are introduced and implemented. Learning of the forward models in these methods is based on linear interpolation, radial basis function networks, or Gaussian processes, given the correspondences between successive frames, which are extracted using SURF descriptors or by constructing so-called cumulator units. To examine the performance of the proposed methods, two different types of experiments are designed; they differ in that depth information is not relevant in the first experiment, while in the second it is.
Our experimental results show that the introduced methods succeed in constructing visual forward models, and they also reveal the strengths and weaknesses of each method.
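
As a rough illustration of one idea named in the summary (learning a forward model with a radial basis function network from frame-to-frame correspondences), the following is a minimal sketch and not the thesis implementation. Everything in it is an assumption made for illustration: the correspondences are synthetic rather than extracted with SURF, the hypothetical `toy_camera` function stands in for a real camera head, and the names `rbf_features`, `fit_rbf`, and `predict_rbf`, the input layout `(x, y, d_pan, d_tilt)`, and all parameter values are invented here.

```python
import numpy as np


def rbf_features(X, centers, width):
    """Gaussian RBF activations of every sample in X for every center."""
    # Squared Euclidean distance between each sample and each center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))


def fit_rbf(X, Y, centers, width, ridge=1e-3):
    """Fit the linear output weights of the RBF network by ridge regression."""
    Phi = rbf_features(X, centers, width)
    A = Phi.T @ Phi + ridge * np.eye(centers.shape[0])
    return np.linalg.solve(A, Phi.T @ Y)


def predict_rbf(X, centers, width, W):
    """Predict the new pixel position for each (pixel, motor step) input."""
    return rbf_features(X, centers, width) @ W


def toy_camera(X, gain=0.5):
    """Hypothetical ground truth: a pan/tilt step shifts each pixel linearly."""
    # Columns 0-1: normalized pixel position; columns 2-3: pan/tilt step.
    return X[:, :2] - gain * X[:, 2:]


rng = np.random.default_rng(0)

# Synthetic "correspondences": inputs are (x, y, d_pan, d_tilt),
# targets are the pixel positions after the movement.
X_train = rng.uniform(-1.0, 1.0, size=(2000, 4))
Y_train = toy_camera(X_train)

# Use a random subset of the training inputs as RBF centers.
centers = X_train[rng.choice(len(X_train), size=200, replace=False)]
W = fit_rbf(X_train, Y_train, centers, width=1.0)

# Evaluate on unseen inputs drawn from inside the training range.
X_test = rng.uniform(-0.8, 0.8, size=(200, 4))
err = np.abs(predict_rbf(X_test, centers, 1.0, W) - toy_camera(X_test)).mean()
print(f"mean absolute prediction error: {err:.4f}")
```

In a real setup, the synthetic pairs would presumably be replaced by matched keypoint positions from successive frames together with the executed joint movements; how the thesis does this in detail is not stated in this record.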