Leveraging Contextual Information for Monocular Depth Estimation

Humans strongly rely on visual cues to understand scenes such as segmenting, detecting objects, or measuring the distance from nearby objects. Recent studies suggest that deep neural networks can take advantage of contextual representation for the estimation of a depth map for a given image. Therefo...

Bibliographic Details
Main Authors: Doyeon Kim, Sihaeng Lee, Janghyeon Lee, Junmo Kim
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Monocular depth estimation; contextual information
Online Access: https://ieeexplore.ieee.org/document/9165723/
id doaj-4944a31438234c47838913c4149cb437
record_format Article
spelling doaj-4944a31438234c47838913c4149cb437; 2021-03-30T01:57:53Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2020-01-01; vol. 8, pp. 147808-147817; DOI 10.1109/ACCESS.2020.3016008; document 9165723
Doyeon Kim (0), https://orcid.org/0000-0003-3717-7275, School of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Sihaeng Lee (1), https://orcid.org/0000-0001-5328-2011, Division of Future Vehicle, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Janghyeon Lee (2), https://orcid.org/0000-0002-8599-4678, School of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Junmo Kim (3), https://orcid.org/0000-0002-7174-7932, School of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
collection DOAJ
language English
format Article
sources DOAJ
author Doyeon Kim
Sihaeng Lee
Janghyeon Lee
Junmo Kim
title Leveraging Contextual Information for Monocular Depth Estimation
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description Humans strongly rely on visual cues to understand scenes such as segmenting, detecting objects, or measuring the distance from nearby objects. Recent studies suggest that deep neural networks can take advantage of contextual representation for the estimation of a depth map for a given image. Therefore, focusing on the scene context can be beneficial for successful depth estimation. In this study, a novel network architecture is proposed to improve the performance by leveraging the contextual information for monocular depth estimation. We introduce a depth prediction network with the proposed attentive skip connection and a global context module, to obtain meaningful semantic features and enhance the performance of the model. Furthermore, our model is validated through several experiments on the KITTI and NYU Depth V2 datasets. The experimental results demonstrate the effectiveness of the proposed network, which achieves a state-of-the-art monocular depth estimation performance while maintaining a high running speed.
topic Monocular depth estimation
contextual information
url https://ieeexplore.ieee.org/document/9165723/