Efficient data structures for piecewise-smooth video processing

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. === Cataloged from PDF version of thesis. === Includes bibliographical references (p. 95-102). === A number of useful image and video processing techniques, ranging from low level oper...

Full description

Bibliographic Details
Main Author: Chen, Jiawen (Jiawen Kevin)
Other Authors: Frédo Durand.
Format: Others
Language:English
Published: Massachusetts Institute of Technology 2011
Subjects:
Online Access:http://hdl.handle.net/1721.1/66003
id ndltd-MIT-oai-dspace.mit.edu-1721.1-66003
record_format oai_dc
spelling ndltd-MIT-oai-dspace.mit.edu-1721.1-660032019-05-02T15:37:37Z Efficient data structures for piecewise-smooth video processing Chen, Jiawen (Jiawen Kevin) Frédo Durand. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 95-102). A number of useful image and video processing techniques, ranging from low level operations such as denoising and detail enhancement to higher level methods such as object manipulation and special effects, rely on piecewise-smooth functions computed from the input data. In this thesis, we present two computationally efficient data structures for representing piecewise-smooth visual information and demonstrate how they can dramatically simplify and accelerate a variety of video processing algorithms. We start by introducing the bilateral grid, an image representation that explicitly accounts for intensity edges. By interpreting brightness values as Euclidean coordinates, the bilateral grid enables simple expressions for edge-aware filters. Smooth functions defined on the bilateral grid are piecewise-smooth in image space. Within this framework, we derive efficient reinterpretations of a number of edge-aware filters commonly used in computational photography as operations on the bilateral grid, including the bilateral filter, edgeaware scattered data interpolation, and local histogram equalization. We also show how these techniques can be easily parallelized onto modern graphics hardware for real-time processing of high definition video. The second data structure we introduce is the video mesh, designed as a flexible central data structure for general-purpose video editing. It represents objects in a video sequence as 2.5D "paper cutouts" and allows interactive editing of moving objects and modeling of depth, which enables 3D effects and post-exposure camera control. In our representation, we assume that motion and depth are piecewise-smooth, and encode them sparsely as a set of points tracked over time. The video mesh is a triangulation over this point set and per-pixel information is obtained by interpolation. To handle occlusions and detailed object boundaries, we rely on the user to rotoscope the scene at a sparse set of frames using spline curves. We introduce an algorithm to robustly and automatically cut the mesh into local layers with proper occlusion topology, and propagate the splines to the remaining frames. Object boundaries are refined with per-pixel alpha mattes. At its core, the video mesh is a collection of texture-mapped triangles, which we can edit and render interactively using graphics hardware. We demonstrate the effectiveness of our representation with special effects such as 3D viewpoint changes, object insertion, depthof- field manipulation, and 2D to 3D video conversion. by Jiawen Chen. Ph.D. 2011-09-27T18:31:08Z 2011-09-27T18:31:08Z 2011 2011 Thesis http://hdl.handle.net/1721.1/66003 751844425 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 102 p. application/pdf Massachusetts Institute of Technology
collection NDLTD
language English
format Others
sources NDLTD
topic Electrical Engineering and Computer Science.
spellingShingle Electrical Engineering and Computer Science.
Chen, Jiawen (Jiawen Kevin)
Efficient data structures for piecewise-smooth video processing
description Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. === Cataloged from PDF version of thesis. === Includes bibliographical references (p. 95-102). === A number of useful image and video processing techniques, ranging from low level operations such as denoising and detail enhancement to higher level methods such as object manipulation and special effects, rely on piecewise-smooth functions computed from the input data. In this thesis, we present two computationally efficient data structures for representing piecewise-smooth visual information and demonstrate how they can dramatically simplify and accelerate a variety of video processing algorithms. We start by introducing the bilateral grid, an image representation that explicitly accounts for intensity edges. By interpreting brightness values as Euclidean coordinates, the bilateral grid enables simple expressions for edge-aware filters. Smooth functions defined on the bilateral grid are piecewise-smooth in image space. Within this framework, we derive efficient reinterpretations of a number of edge-aware filters commonly used in computational photography as operations on the bilateral grid, including the bilateral filter, edgeaware scattered data interpolation, and local histogram equalization. We also show how these techniques can be easily parallelized onto modern graphics hardware for real-time processing of high definition video. The second data structure we introduce is the video mesh, designed as a flexible central data structure for general-purpose video editing. It represents objects in a video sequence as 2.5D "paper cutouts" and allows interactive editing of moving objects and modeling of depth, which enables 3D effects and post-exposure camera control. In our representation, we assume that motion and depth are piecewise-smooth, and encode them sparsely as a set of points tracked over time. The video mesh is a triangulation over this point set and per-pixel information is obtained by interpolation. To handle occlusions and detailed object boundaries, we rely on the user to rotoscope the scene at a sparse set of frames using spline curves. We introduce an algorithm to robustly and automatically cut the mesh into local layers with proper occlusion topology, and propagate the splines to the remaining frames. Object boundaries are refined with per-pixel alpha mattes. At its core, the video mesh is a collection of texture-mapped triangles, which we can edit and render interactively using graphics hardware. We demonstrate the effectiveness of our representation with special effects such as 3D viewpoint changes, object insertion, depthof- field manipulation, and 2D to 3D video conversion. === by Jiawen Chen. === Ph.D.
author2 Frédo Durand.
author_facet Frédo Durand.
Chen, Jiawen (Jiawen Kevin)
author Chen, Jiawen (Jiawen Kevin)
author_sort Chen, Jiawen (Jiawen Kevin)
title Efficient data structures for piecewise-smooth video processing
title_short Efficient data structures for piecewise-smooth video processing
title_full Efficient data structures for piecewise-smooth video processing
title_fullStr Efficient data structures for piecewise-smooth video processing
title_full_unstemmed Efficient data structures for piecewise-smooth video processing
title_sort efficient data structures for piecewise-smooth video processing
publisher Massachusetts Institute of Technology
publishDate 2011
url http://hdl.handle.net/1721.1/66003
work_keys_str_mv AT chenjiawenjiawenkevin efficientdatastructuresforpiecewisesmoothvideoprocessing
_version_ 1719025030633881600