MEASURING VISUAL MOTION FROM IMAGE SEQUENCES

Bibliographic Details
Main Author: ANANDAN, PADMANABHAN
Language: English
Published: ScholarWorks@UMass Amherst 1987
Online Access: https://scholarworks.umass.edu/dissertations/AAI8727015
Description
Summary: Motion is an important and fundamental source of visual information. It is well known that the pattern of image motion contains information useful for determining the 3-dimensional structure of the environment and the relative motion between the camera and the objects in the scene. However, the accurate measurement of image motion from a sequence of real images has proven to be difficult. In this thesis, a hierarchical framework for the computation of dense displacement fields from pairs of images, and an integrated system consistent with that framework, are described. Each input intensity image is first decomposed using a set of spatial-frequency tuned channels. The information in the low-frequency channels is used to provide rough displacements over a large range, which are then successively refined by using the information in the higher-frequency channels. Within each channel, a direction-dependent confidence measure is computed for each displacement vector, and a smoothness constraint is used to propagate reliable displacement vectors to neighboring areas with less reliable vectors. For our integrated system, Burt's Laplacian pyramid transform is used for the spatial-frequency decomposition, and minimization of the sum-of-squared-differences (SSD) measure is used as the match criterion. The confidence measure is derived from the shape of the SSD surface, and the smoothness constraint is formulated as a functional minimization problem. Results of applying our system to several image pairs containing complex camera motion as well as independently moving objects are included. A number of well-known gradient-based and matching techniques are also shown to be consistent with our framework. The mathematical relationship between the gradient-based techniques and a class of correlation techniques is established. This thesis also includes several proposals for extending our approach to multiple-frame analysis. Of particular interest is an approach that involves the decomposition of the input images according to orientation as well as scale. This new approach unifies the spatio-temporal energy models, which are currently popular in psychophysics, with the gradient-based and the matching techniques, and appears both biologically feasible and ideally suited to connectionist models of computation.
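
The coarse-to-fine SSD matching outlined in the summary can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes NumPy, replaces Burt's Laplacian pyramid with plain 2x2 block averaging, estimates the displacement at a single pixel rather than a dense field, and omits the confidence measure and the smoothness-based propagation. The function names (`downsample`, `ssd_surface`, `coarse_to_fine`) and the parameters `half`, `radius`, and `levels` are hypothetical and chosen only for illustration.

```python
import numpy as np

def downsample(img):
    """Halve resolution by 2x2 block averaging.
    Assumption: a crude stand-in for one pyramid level, not Burt's Laplacian pyramid."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2]
                   + img[0::2, 1::2] + img[1::2, 1::2])

def ssd_surface(img0, img1, y, x, guess, half=3, radius=1):
    """SSD between the patch of img0 centred at (y, x) and the patches of img1
    centred at (y, x) + guess + (dy, dx), for all |dy|, |dx| <= radius."""
    patch = img0[y - half:y + half + 1, x - half:x + half + 1]
    size = 2 * radius + 1
    surface = np.full((size, size), np.inf)  # inf marks out-of-bounds candidates
    for i, dy in enumerate(range(-radius, radius + 1)):
        for j, dx in enumerate(range(-radius, radius + 1)):
            yy, xx = y + guess[0] + dy, x + guess[1] + dx
            if yy - half < 0 or xx - half < 0:
                continue
            cand = img1[yy - half:yy + half + 1, xx - half:xx + half + 1]
            if cand.shape == patch.shape:
                surface[i, j] = np.sum((patch - cand) ** 2)
    return surface

def coarse_to_fine(img0, img1, y, x, levels=3, half=3, radius=1):
    """Estimate the integer displacement (dy, dx) of pixel (y, x) from img0 to img1.
    The coarsest level gives a rough estimate; each finer level refines it."""
    pyr0, pyr1 = [img0], [img1]
    for _ in range(levels - 1):
        pyr0.append(downsample(pyr0[-1]))
        pyr1.append(downsample(pyr1[-1]))
    d = np.array([0, 0])
    for lvl in range(levels - 1, -1, -1):
        scale = 2 ** lvl
        surface = ssd_surface(pyr0[lvl], pyr1[lvl],
                              y // scale, x // scale, d, half, radius)
        i, j = np.unravel_index(np.argmin(surface), surface.shape)
        d = d + np.array([i - radius, j - radius])  # refine at this level
        if lvl > 0:
            d = 2 * d  # carry the estimate to the next finer level
    return tuple(int(v) for v in d)
```

As a usage example, one could build a test pair with `img1 = np.roll(img0, shift=(4, -4), axis=(0, 1))` on a random 64x64 image and call `coarse_to_fine(img0, img1, 32, 32)`. With a search radius of one pixel per level, three levels cover displacements of roughly seven pixels at full resolution, which shows how the low-resolution levels extend the search range at little cost.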