LEARNING MULTI-MODAL FEATURES FOR DENSE MATCHING-BASED CONFIDENCE ESTIMATION

In recent years, the ability to assess the uncertainty of depth estimates in the context of dense stereo matching has received increased attention due to its potential to detect erroneous estimates. In particular, the introduction of deep learning approaches has greatly improved performance, with feature extraction from multiple modalities proving highly advantageous because each modality contributes unique and complementary characteristics. However, most work in the literature uses only mono-, bi- or, rarely, tri-modal input and does not consider the potential benefit of going beyond three modalities. To further advance the idea of combining different types of features for confidence estimation, this work proposes a CNN-based approach that exploits uncertainty cues from up to four modalities. For this purpose, a state-of-the-art local-global approach is used as baseline and extended accordingly. Additionally, a novel disparity-based modality named warped difference is presented to support uncertainty estimation at common failure cases of dense stereo matching. The general validity and improved performance of the proposed approach are demonstrated and compared against the bi-modal baseline in an evaluation on three datasets using two common dense stereo matching techniques.
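
The warped difference modality is described only at the level of the abstract in this record. As a rough, hypothetical illustration of such a disparity-based cue, the sketch below warps the right image into the left view using the estimated disparity and takes the per-pixel absolute residual; the function name, the nearest-neighbour warping and the exact definition are assumptions made for illustration and are not taken from the paper.

    # Illustrative sketch (not the paper's formulation): a disparity-based
    # "warped difference" cue for stereo confidence estimation. The right
    # image is warped into the left view using the estimated disparity and
    # compared against the left image; large residuals hint at unreliable
    # disparities (e.g. occlusions or mismatches).
    import numpy as np

    def warped_difference(left, right, disparity):
        """Per-pixel absolute difference between the left image and the
        right image warped to the left view via the estimated disparity.

        left, right : (H, W) grayscale images as float arrays
        disparity   : (H, W) disparity map estimated for the left view
        """
        h, w = left.shape
        ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
        # Under rectified stereo, pixel (x, y) in the left view corresponds
        # to (x - d, y) in the right view; sample with nearest neighbour.
        src_x = np.clip(np.rint(xs - disparity).astype(int), 0, w - 1)
        warped_right = right[ys, src_x]
        return np.abs(left - warped_right)

    # Toy usage: a right view shifted by a constant disparity of 4 pixels.
    left = np.random.rand(64, 128).astype(np.float32)
    right = np.roll(left, -4, axis=1)
    disp = np.full((64, 128), 4.0, dtype=np.float32)
    cue = warped_difference(left, right, disp)  # near zero except at the left border

Regions with large residuals typically coincide with occlusions and mismatches, which is why such a cue can support confidence estimation; the formulation actually used in the paper and its integration into the CNN should be taken from the full text linked below.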


Bibliographic Details
Main Authors: K. Heinrich, M. Mehltretter (Institute of Photogrammetry and GeoInformation, Leibniz University Hannover, Germany)
Format: Article
Language: English
Published: Copernicus Publications, 2021-06-01
Series: The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLIII-B2-2021, pp. 91-99
ISSN: 1682-1750, 2194-9034
DOI: 10.5194/isprs-archives-XLIII-B2-2021-91-2021
Online Access: https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XLIII-B2-2021/91/2021/isprs-archives-XLIII-B2-2021-91-2021.pdf