Distributed and Higher-Order Graphical Models: towards Segmentation, Tracking, Matching and 3D Model Inference

Bibliographic Details
Main Author: Wang, Chaohui
Language: French
Published: Ecole Centrale Paris, 2011
Online Access: http://tel.archives-ouvertes.fr/tel-00658765
http://tel.archives-ouvertes.fr/docs/01/01/85/71/PDF/ThesisChaohuiWang.pdf
Description
Summary: This thesis is devoted to the development of graph-based methods that address several of the most fundamental computer vision problems, such as segmentation, tracking, shape matching and 3D model inference. The first contribution of this thesis is a unified, single-shot optimization framework for simultaneous segmentation, depth ordering and multi-object tracking from monocular video sequences using a pairwise Markov Random Field (MRF). This is achieved through a novel 2.5D layered model in which object-level and pixel-level representations are seamlessly combined through local constraints. Towards introducing high-level knowledge, such as shape priors, we then studied the problem of non-rigid 3D surface matching. The second contribution of this thesis consists of a higher-order graph matching formulation that encodes various measurements of geometric/appearance similarity and intrinsic deformation error. As the third contribution of this thesis, higher-order interactions were further considered to build pose-invariant statistical shape priors, which were exploited to develop a novel approach for knowledge-based 3D segmentation in medical imaging that is invariant to the global pose and the initialization of the shape model. The last contribution of this thesis aims to partially address the influence of camera pose in visual perception. To this end, we introduced a unified paradigm for 3D landmark model inference from monocular 2D images that simultaneously determines both the optimal 3D model and the corresponding 2D projections without explicit estimation of the camera viewpoint, and is also able to deal with misdetections/occlusions.
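
For orientation, the energies named in the summary follow the standard MRF and graph-matching templates; the sketch below shows only these generic forms, with symbols chosen for illustration (the thesis defines its own potentials for the 2.5D layered model and the matching formulation). A pairwise MRF over a label assignment \mathbf{x} on a graph (\mathcal{V}, \mathcal{E}) is minimized as

E(\mathbf{x}) = \sum_{p \in \mathcal{V}} \theta_p(x_p) + \sum_{(p,q) \in \mathcal{E}} \theta_{pq}(x_p, x_q),

where the unary terms \theta_p score per-node evidence and the pairwise terms \theta_{pq} encode local constraints between neighboring nodes. The higher-order formulations mentioned for surface matching and shape priors extend this to potentials over cliques c of three or more nodes,

E(\mathbf{x}) = \sum_{c \in \mathcal{C}} \theta_c(\mathbf{x}_c),

which allows measurements such as similarity or deformation error over tuples of candidate correspondences to be expressed, at the cost of more involved inference.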