A pipelined framework for multi-scale image comparison


Bibliographic Details
Main Author: Martindale, David M.
Language: English
Published: 2009
Online Access: http://hdl.handle.net/2429/14881
Description
Summary: In this thesis we present a pipelined framework and a family of algorithms for determining meaningful differences between two images. The pipeline is structured in a way that makes it easy to use only a portion of the pipeline when that is appropriate, and any module in the pipeline can be replaced in its entirety by a new module that uses a different algorithm, while requiring minimal or no changes to other modules. Many pieces of the pipeline can be used individually.

Three key design goals were to facilitate comparison of real images to computer-generated images, to take into account the sensitivity of the human visual system, and to accept pairs of images with arbitrary size differences and moderate misalignment in translation, rotation, and field of view. Most existing approaches fail badly at this task, requiring the two images to be of the same size and perfectly aligned with each other.

The pipeline automatically aligns the two images as well as it can in the first four of its five phases. Images are first separated into octave-spaced bands of spatial frequencies, using the wavelet-based technique of Mallat and Zhong to find edges at multiple scales in each of the images. These edges are then matched using a graph-theoretic matching technique in the second phase, a least-squares fit is found for the alignment in the third phase, and the resulting transformation is applied using appropriate resampling in the fourth phase. The final phase of the pipeline computes difference measures for each spatial frequency band and three colour components, weighted according to the human contrast sensitivity function. The same Mallat-Zhong wavelet transform used for edge detection is used as the multi-band filter during comparison.

We describe the pipeline and each component in detail, and discuss alternate approaches for each of the phases and generalizations of the framework. The flexibility of the framework is demonstrated by examples.
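The modular five-phase structure described above can be sketched as a pipeline of interchangeable stages. This is purely an illustration of the design: the stage names and the shared-state interface are hypothetical, not the thesis's actual API.

```python
from typing import Callable, Dict, List, Any

# Illustrative sketch only: stage names and the shared-state interface
# are hypothetical, not taken from the thesis. Each phase is a callable
# mapping a state dict to a state dict, so any module can be swapped
# out independently, and a prefix of the pipeline can be run alone.
Stage = Callable[[Dict[str, Any]], Dict[str, Any]]

def run_pipeline(stages: List[Stage], state: Dict[str, Any]) -> Dict[str, Any]:
    for stage in stages:
        state = stage(state)
    return state

# Placeholder phases standing in for the real algorithms.
def detect_edges(state):   # phase 1: multi-scale wavelet edge detection
    state["edges"] = "edge maps per octave band"
    return state

def match_edges(state):    # phase 2: graph-theoretic edge matching
    state["matches"] = "matched edge pairs"
    return state

def fit_alignment(state):  # phase 3: least-squares alignment fit
    state["transform"] = "similarity transform"
    return state

def resample(state):       # phase 4: apply transform with resampling
    state["aligned"] = True
    return state

def compare(state):        # phase 5: CSF-weighted band differences
    state["differences"] = "per-band, per-channel measures"
    return state

full = [detect_edges, match_edges, fit_alignment, resample, compare]

# Running only the alignment portion (the first four phases) is a slice:
aligned_only = run_pipeline(full[:4], {"images": ("A", "B")})
```

Because each stage depends only on the keys it reads from the shared state, replacing one module (say, a different edge detector in phase 1) leaves the others untouched, which mirrors the replaceability goal stated in the abstract.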
We show how to restructure the data flow to implement an improved iterative multi-scale alignment technique, and we analyze a Laplacian-based edge detector that could be used instead of wavelets.
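The least-squares alignment fit of the third phase can be illustrated with the standard closed-form solution for fitting a 2D similarity transform (uniform scale, rotation, translation) to matched point pairs. This is a generic sketch of that class of fit, not the thesis's exact formulation.

```python
import math

def fit_similarity(src, dst):
    """Least-squares fit of dst ~ s * R(theta) * src + t for 2D point pairs.

    Generic closed-form solution, sketching the kind of alignment fit
    performed in phase three; not the thesis's exact formulation.
    Returns (s, theta, tx, ty).
    """
    n = len(src)
    mx = sum(p[0] for p in src) / n          # source centroid
    my = sum(p[1] for p in src) / n
    ux = sum(q[0] for q in dst) / n          # destination centroid
    uy = sum(q[1] for q in dst) / n
    # Centre both point sets, then accumulate the cross terms.
    a = b = d = 0.0
    for (x, y), (u, v) in zip(src, dst):
        xc, yc, uc, vc = x - mx, y - my, u - ux, v - uy
        a += xc * uc + yc * vc               # "cosine-like" correlation term
        b += xc * vc - yc * uc               # "sine-like" correlation term
        d += xc * xc + yc * yc               # spread of the source points
    theta = math.atan2(b, a)                 # optimal rotation angle
    s = math.hypot(a, b) / d                 # optimal uniform scale
    # Translation maps the scaled, rotated source centroid onto the
    # destination centroid.
    tx = ux - s * (math.cos(theta) * mx - math.sin(theta) * my)
    ty = uy - s * (math.sin(theta) * mx + math.cos(theta) * my)
    return s, theta, tx, ty
```

Given exact correspondences generated by a known similarity transform, this fit recovers that transform; with noisy matched edge points, it returns the transform minimizing the sum of squared residuals.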