An Embodied Multi-Sensor Fusion Approach to Visual Motion Estimation Using Unsupervised Deep Networks

Aimed at improving size, weight, and power (SWaP)-constrained robotic vision-aided state estimation, we describe our unsupervised, deep convolutional-deconvolutional sensor fusion network, Multi-Hypothesis DeepEfference (MHDE). MHDE learns to intelligently combine noisy heterogeneous sensor data to predict several probable hypotheses for the dense, pixel-level correspondence between a source image and an unseen target image. We show how our multi-hypothesis formulation provides increased robustness against dynamic, heteroscedastic sensor and motion noise by computing hypothesis image mappings and predictions at 76–357 Hz depending on the number of hypotheses being generated. MHDE fuses noisy, heterogeneous sensory inputs using two parallel, inter-connected architectural pathways and n (1–20 in this work) multi-hypothesis generating sub-pathways to produce n global correspondence estimates between a source and a target image. We evaluated MHDE on the KITTI Odometry dataset and benchmarked it against the vision-only DeepMatching and Deformable Spatial Pyramids algorithms and were able to demonstrate a significant runtime decrease and a performance increase compared to the next-best performing method.
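The core multi-hypothesis idea from the abstract can be illustrated with a minimal sketch: each sub-pathway proposes a candidate warping of the source image, and the hypothesis whose warped prediction best matches the target image (lowest photometric error) wins. This is a simplified assumption about how hypotheses are scored, not the authors' implementation; the function names below are hypothetical.

```python
import numpy as np

def photometric_error(pred, target):
    # Mean absolute per-pixel error between a warped prediction and the target image.
    return float(np.mean(np.abs(pred - target)))

def best_hypothesis(warped_preds, target):
    """Select the hypothesis (warped source image) that best matches the target.

    warped_preds: list of H x W arrays, one per hypothesis-generating sub-pathway.
    Returns (index of the best hypothesis, its photometric error).
    In a winner-take-all training scheme, only the winning hypothesis's loss
    would be back-propagated, which is one way a multi-hypothesis network can
    stay robust to heteroscedastic sensor and motion noise.
    """
    errors = [photometric_error(p, target) for p in warped_preds]
    i = int(np.argmin(errors))
    return i, errors[i]

# Toy example: hypothesis 1 reproduces the target exactly.
target = np.ones((4, 4))
hyps = [np.zeros((4, 4)), np.ones((4, 4)), np.full((4, 4), 0.5)]
idx, err = best_hypothesis(hyps, target)
# idx == 1, err == 0.0
```

In the actual network, each of the n sub-pathways would output a dense pixel-level correspondence field rather than a pre-warped image; the selection step above only sketches how competing hypotheses might be compared.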

Bibliographic Details
Main Authors: E. Jared Shamwell, William D. Nothwang, Donald Perlis
Format: Article
Language: English
Published: MDPI AG, 2018-05-01
Series: Sensors
ISSN: 1424-8220
Subjects: deep learning; sensor fusion; optical flow
Online Access: http://www.mdpi.com/1424-8220/18/5/1427
Author Affiliations:
E. Jared Shamwell: Sensors and Electron Devices Directorate, US Army Research Laboratory, 2800 Powder Mill Rd, Adelphi, MD 20783, USA
William D. Nothwang: Sensors and Electron Devices Directorate, US Army Research Laboratory, 2800 Powder Mill Rd, Adelphi, MD 20783, USA
Donald Perlis: Department of Computer Science, University of Maryland, A.V. Williams Building, College Park, MD 20740, USA
DOI: 10.3390/s18051427
Collection: DOAJ