Summary: | Visual motion is one of the most important cues for the interpretation of image sequences. A contiguous region whose motion can be characterised by a single set of parameters is very likely to correspond to a distinct physical object. Knowledge of the motion and spatial extent of such regions can greatly enhance the efficiency of applications such as image sequence coding and video restoration. The first part of this dissertation analyses existing motion estimation techniques in terms of the extent to which they are able to recover true motion, to identify uncovered background (those areas of an image that were not visible in the preceding image), and to model abrupt motion discontinuities. A novel motion estimation technique is presented that performs well according to these criteria. It is argued that motion estimation alone is insufficient for future applications because estimation is performed at the wrong level of abstraction: the object view is missing. The focus in the latter part of this dissertation is on techniques that treat motion estimation and segmentation as equal and integral parts of the estimation process. A model, known as the <I>layer model</I>, is introduced, which explicitly represents images as the superposition of a number of underlying objects or layers. Research interest has naturally focused on the development of techniques that enable arbitrary image sequences to be efficiently decomposed into their constituent layers. A promising approach, proposed in the literature, is to model the distribution of motion within an image as a probabilistic mixture, the parameters of which may be recovered by the <I>Expectation-Maximisation (EM)</I> algorithm. Extensions to this approach are proposed that encourage spatial coherence and allow more precise modelling of uncovering and occlusion. 
A number of issues are proposed for further research, notably extending the technique to model non-rigid motion more accurately and to incorporate more sophisticated selection of the layer model order.
|