Compact Representations and Multi-cue Integration for Robotics

Bibliographic Details
Main Author: Söderberg, Robert
Format: Others
Language: English
Published: Linköpings universitet, Bildbehandling 2005
Subjects:
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-5574
http://nbn-resolving.de/urn:isbn:91-85299-36-7
Description
Summary: This thesis presents methods useful in a bin-picking application, such as detection and representation of local features, pose estimation and multi-cue integration. The scene tensor is a representation of multiple line or edge segments and was first introduced by Nordberg in [30]. A method for estimating scene tensors from gray-scale images is presented. The method is based on orientation tensors: the scene tensor is estimated by correlating the elements of the orientation tensor with a number of 1D filters. Mechanisms for analyzing the scene tensor are described, and an algorithm for detecting interest points and estimating feature parameters is presented. It is shown that the algorithm gives good results on a wide spectrum of images. Representations that are invariant with respect to a set of transformations are useful in many applications, such as pose estimation, tracking and wide baseline stereo. The scene tensor itself is not invariant, and three different methods for implementing an invariant representation based on the scene tensor are presented. One is based on a non-linear transformation of the scene tensor and is invariant to perspective transformations. Two versions of a tensor doublet are also presented; the doublet is based on the geometry of two interest points and is invariant to translation, rotation and scaling. The tensor doublet is used in a framework for view-centered pose estimation of 3D objects. It is shown that the pose estimation algorithm performs well even when the object is occluded and appears at a different scale than in the training situation. An industrial implementation of a bin-picking application has to cope with several different types of objects. All pose estimation algorithms use some kind of model, and there is as yet no model that can cope with all kinds of situations and objects. This thesis presents a method for integrating cues from several pose estimation algorithms to increase the stability of the system. The same framework is also shown to increase the accuracy of the system by using cues from several views of the object. An extensive test with several different objects, lighting conditions and backgrounds shows that multi-cue integration makes the system more robust and increases the accuracy. Finally, a system for bin picking is presented, built from the previous parts of this thesis. An eye-in-hand setup is used with a standard industrial robot arm. It is shown that the system works in real bin-picking situations, with a positioning error below 1 mm and an orientation error below 1 degree in most of the situations.
Report code: LiU-TEK-LIC-2005:15.