Inference Machines: Parsing Scenes via Iterated Predictions

Extracting a rich representation of an environment from visual sensor readings can benefit many tasks in robotics, e.g., path planning, mapping, and object manipulation. While important progress has been made, it remains a difficult problem to effectively parse entire scenes, i.e., to recognize semanti...

Bibliographic Details
Main Author:	Munoz, Daniel
Format:	Others
Published:	Research Showcase @ CMU 2013
Subjects:	Robotics
Online Access:	http://repository.cmu.edu/dissertations/305 http://repository.cmu.edu/cgi/viewcontent.cgi?article=1309&context=dissertations

id	ndltd-cmu.edu-oai-repository.cmu.edu-dissertations-1309
record_format	oai_dc
spelling	ndltd-cmu.edu-oai-repository.cmu.edu-dissertations-1309 2014-07-24T15:36:16Z Inference Machines: Parsing Scenes via Iterated Predictions Munoz, Daniel Extracting a rich representation of an environment from visual sensor readings can benefit many tasks in robotics, e.g., path planning, mapping, and object manipulation. While important progress has been made, it remains a difficult problem to effectively parse entire scenes, i.e., to recognize semantic objects, man-made structures, and landforms. This process requires not only recognizing individual entities but also understanding the contextual relations among them. The prevalent approach to encode such relationships is to use a joint probabilistic or energy-based model which enables one to naturally write down these interactions. Unfortunately, performing exact inference over these expressive models is often intractable and instead we can only approximate the solutions. While there exists a set of sophisticated approximate inference techniques to choose from, the combination of learning and approximate inference for these expressive models is still poorly understood in theory and limited in practice. Furthermore, using approximate inference on any learned model often leads to suboptimal predictions due to the inherent approximations. As we ultimately care about predicting the correct labeling of a scene, and not necessarily learning a joint model of the data, this work proposes to instead view the approximate inference process as a modular procedure that is directly trained in order to produce a correct labeling of the scene. Inspired by early hierarchical models in the computer vision literature for scene parsing, the proposed inference procedure is structured to incorporate both feature descriptors and contextual cues computed at multiple resolutions within the scene. We demonstrate that this inference machine framework for parsing scenes via iterated predictions offers the best of both worlds: state-of-the-art classification accuracy and computational efficiency when processing images and/or unorganized 3-D point clouds. Additionally, we address critical problems that arise in practice when parsing scenes on board real-world systems: integrating data from multiple sensor modalities and efficiently processing data that is continuously streaming from the sensors. 2013-06-06T07:00:00Z text application/pdf http://repository.cmu.edu/dissertations/305 http://repository.cmu.edu/cgi/viewcontent.cgi?article=1309&context=dissertations Dissertations Research Showcase @ CMU Robotics
collection	NDLTD
format	Others
sources	NDLTD
topic	Robotics
spellingShingle	Robotics Munoz, Daniel Inference Machines: Parsing Scenes via Iterated Predictions
description	Extracting a rich representation of an environment from visual sensor readings can benefit many tasks in robotics, e.g., path planning, mapping, and object manipulation. While important progress has been made, it remains a difficult problem to effectively parse entire scenes, i.e., to recognize semantic objects, man-made structures, and landforms. This process requires not only recognizing individual entities but also understanding the contextual relations among them. The prevalent approach to encode such relationships is to use a joint probabilistic or energy-based model which enables one to naturally write down these interactions. Unfortunately, performing exact inference over these expressive models is often intractable and instead we can only approximate the solutions. While there exists a set of sophisticated approximate inference techniques to choose from, the combination of learning and approximate inference for these expressive models is still poorly understood in theory and limited in practice. Furthermore, using approximate inference on any learned model often leads to suboptimal predictions due to the inherent approximations. As we ultimately care about predicting the correct labeling of a scene, and not necessarily learning a joint model of the data, this work proposes to instead view the approximate inference process as a modular procedure that is directly trained in order to produce a correct labeling of the scene. Inspired by early hierarchical models in the computer vision literature for scene parsing, the proposed inference procedure is structured to incorporate both feature descriptors and contextual cues computed at multiple resolutions within the scene. We demonstrate that this inference machine framework for parsing scenes via iterated predictions offers the best of both worlds: state-of-the-art classification accuracy and computational efficiency when processing images and/or unorganized 3-D point clouds. Additionally, we address critical problems that arise in practice when parsing scenes on board real-world systems: integrating data from multiple sensor modalities and efficiently processing data that is continuously streaming from the sensors.
author	Munoz, Daniel
author_facet	Munoz, Daniel
author_sort	Munoz, Daniel
title	Inference Machines: Parsing Scenes via Iterated Predictions
title_short	Inference Machines: Parsing Scenes via Iterated Predictions
title_full	Inference Machines: Parsing Scenes via Iterated Predictions
title_fullStr	Inference Machines: Parsing Scenes via Iterated Predictions
title_full_unstemmed	Inference Machines: Parsing Scenes via Iterated Predictions
title_sort	inference machines: parsing scenes via iterated predictions
publisher	Research Showcase @ CMU
publishDate	2013
url	http://repository.cmu.edu/dissertations/305 http://repository.cmu.edu/cgi/viewcontent.cgi?article=1309&context=dissertations
work_keys_str_mv	AT munozdaniel inferencemachinesparsingscenesviaiteratedpredictions
_version_	1716709425492787200

Inference Machines: Parsing Scenes via Iterated Predictions

Similar Items