Visual Deprojection: Probabilistic Recovery of Collapsed Dimensions

We introduce visual deprojection: The task of recovering an image or video that has been collapsed along a dimension. Projections arise in various contexts, such as long-exposure photography, where a dynamic scene is collapsed in time to produce a motion-blurred image, and corner cameras, where refl...

Full description

Bibliographic Details
Main Authors: Balakrishnan, Guha (Author), Dalca, Adrian Vasile (Author), Zhao, Amy (Xiaoyu Amy) (Author), Guttag, John V (Author), Durand, Frédo (Author), Freeman, William T (Author)
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory (Contributor), Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Contributor)
Format: Article
Language:English
Published: IEEE, 2021-02-19T15:02:36Z.
Subjects:
Online Access:Get fulltext
LEADER 02432 am a22002653u 4500
001 129833
042 |a dc 
100 1 0 |a Balakrishnan, Guha  |e author 
100 1 0 |a Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory  |e contributor 
100 1 0 |a Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science  |e contributor 
700 1 0 |a Dalca, Adrian Vasile  |e author 
700 1 0 |a Zhao, Amy   |q  (Xiaoyu Amy)   |e author 
700 1 0 |a Guttag, John V  |e author 
700 1 0 |a Durand, Frédo  |e author 
700 1 0 |a Freeman, William T  |e author 
245 0 0 |a Visual Deprojection: Probabilistic Recovery of Collapsed Dimensions 
260 |b IEEE,   |c 2021-02-19T15:02:36Z. 
856 |z Get fulltext  |u https://hdl.handle.net/1721.1/129833 
520 |a We introduce visual deprojection: The task of recovering an image or video that has been collapsed along a dimension. Projections arise in various contexts, such as long-exposure photography, where a dynamic scene is collapsed in time to produce a motion-blurred image, and corner cameras, where reflected light from a scene is collapsed along a spatial dimension because of an edge occluder to yield a 1D video. Deprojection is ill-posed - often there are many plausible solutions for a given input. We first propose a probabilistic model capturing the ambiguity of the task. We then present a variational inference strategy using convolutional neural networks as functional approximators. Sampling from the inference network at test time yields plausible candidates from the distribution of original signals that are consistent with a given input projection. We evaluate the method on several datasets for both spatial and temporal deprojection tasks. We first demonstrate the method can recover human gait videos and face images from spatial projections, and then show that it can recover videos of moving digits from dramatically motion-blurred images obtained via temporal projection. 
520 |a United States. Defense Advanced Research Projects Agency. Revolutionary Enhancement of Visibility by Exploiting Active Light-fields Program (Contract HR0011-16-C-0030) 
520 |a National Institutes of Health (U.S.) (Grant 1R21AG050122) 
546 |a en 
655 7 |a Article 
773 |t 10.1109/ICCV.2019.00026 
773 |t Proceedings of the IEEE International Conference on Computer Vision