A saliency based framework for multi-modal registration

Bibliographic Details
Main Author: Brown, Mark R.
Other Authors: Guillemaut, J.-Y.
Published: University of Surrey 2016
Online Access: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.701568
Description
Summary: In recent years the Digital Film Production process has seen a large increase in the amount of data captured, resulting in the need for automated tools within the pipeline. In particular, it typically involves the capture of multi-modal data such as 3D Light Detection And Ranging (LiDAR) scans, 2D images and videos, whose alignment and registration provide valuable information within the production process. This multi-modal registration problem poses significant challenges that are not faced in the majority of feature-based registration pipelines. In particular, many existing feature detectors make modality-specific assumptions about the attributes a good, repeatable feature should possess, and as a result cannot be applied in a general, multi-modal manner. To address this, we take a saliency-based approach to feature detection that may be more meaningfully applied across modalities than other feature detectors. Furthermore, by extracting only the most salient features of a scene, significantly fewer features are obtained, resulting in a lower computational cost for the registration process.

The first contribution of this thesis is a generalisation of the Kadir-Brady salient point detector. The generalisation provides both a more robust alternative for 2D images and a 3D extension, which may operate on both the geometry and the texture of the scene. As a result, it allows for more meaningful multi-modal feature detection, and higher repeatability is observed when compared to existing 2D-3D point feature detectors.

The second contribution is a novel salient line segment detector. By explicitly accounting for the surroundings of a line, the approach naturally avoids repetitive parts of a scene while detecting the strong, discriminative lines present. Its general, histogram-based framework allows for a natural extension to depth imagery and 3D, where lines are detected based jointly on texture and geometry.

The final contribution is centred on the registration phase, where a globally optimal solution to 2D-3D registration from points or lines, based on a Branch-and-Bound (BnB) approach, is proposed. Novel search procedures are proposed to speed up the algorithm, taking advantage of the special nested BnB structure used. The optimality properties of the proposed approach allow 2D-3D registration to be achieved for significantly higher outlier rates than existing approaches.
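
As a rough illustration of what a saliency-based, histogram-driven detector measures, the sketch below scores each pixel by the Shannon entropy of the intensity histogram in its local window, which is the core quantity in the original Kadir-Brady detector. It is not the generalisation proposed in the thesis: it works at a single fixed scale and omits scale selection and inter-scale weighting, and the function names, window radius and toy image are all assumptions made for the example.

```python
import math
from collections import Counter


def histogram_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def window_values(image, row, col, radius):
    """Pixel values in the square window around (row, col), clipped to the image."""
    rows = range(max(0, row - radius), min(len(image), row + radius + 1))
    cols = range(max(0, col - radius), min(len(image[0]), col + radius + 1))
    return [image[r][c] for r in rows for c in cols]


def entropy_saliency(image, radius):
    """Per-pixel local-histogram entropy at one fixed scale (hypothetical helper)."""
    return [
        [
            histogram_entropy(window_values(image, r, c, radius))
            for c in range(len(image[0]))
        ]
        for r in range(len(image))
    ]


# Toy 5x5 image (assumed data): flat background with a varied 2x2 patch.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 2, 0],
    [0, 0, 3, 4, 0],
    [0, 0, 0, 0, 0],
]
saliency = entropy_saliency(image, radius=1)
```

Flat regions have a single-bin histogram and score zero, while windows covering the varied patch score higher. Because the measure depends only on a histogram of local attributes, the same idea can in principle be applied to attributes other than 2D intensity, which is the property the summary relies on for multi-modal use.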