Deep Learning Based Depth Estimation and Analysis for 360° Stereo Cameras


Bibliographic Details
Main Authors: Hsieh, Meng-Hsun, 謝孟勳
Other Authors: Hang, Hsueh-Ming
Format: Others
Language: en_US
Published: 2019
Online Access:http://ndltd.ncl.edu.tw/handle/y7mjfw
id ndltd-TW-107NCTU5428199
record_format oai_dc
spelling ndltd-TW-107NCTU54281992019-11-26T05:16:54Z http://ndltd.ncl.edu.tw/handle/y7mjfw Deep Learning Based Depth Estimation and Analysis for 360° Stereo Cameras 基於深度學習之雙球型全景相機環景深度圖估測與分析 Hsieh, Meng-Hsun 謝孟勳 Master's === National Chiao Tung University === Institute of Electronics === 107 === 360-degree virtual view synthesis plays an important role in Virtual Reality, and the depth map is the key information for reconstructing the 3D world. In this study, we use two spherical cameras to form a 360° stereo system, which captures the entire surrounding scene in two views. We then use these two spherical images to estimate the spherical depth map. We developed a depth estimation procedure for spherical stereo images using an existing neural network, PSMNet. To train the network for spherical disparity estimation, we built a panoramic stereo image dataset based on the SYNTHIA dataset, which provides disparity ground truth. More importantly, we investigated the limits of spherical image depth estimation. Unlike the disparity definition in perspective-view stereo, spherical disparity is measured as the angular difference of the same object point between the two views. Thus, an object point aligned with the baseline has zero spherical disparity. From the image-plane pixel resolution, we derived the maximum sensing distance for spherical disparity estimation. We also studied the occlusion problem of a surface in spherical stereo and derived the minimum reliable sensing distance. Both distance limits are functions of the baseline length; these properties help us choose an appropriate baseline for constructing a spherical stereo system. In our experiments, we performed depth estimation on both synthetic and real-scene images, and evaluated the performance on the synthetic images against the ground-truth depth. On the SYNTHIA test set, we achieve an error rate of 2.18% under the KITTI benchmark D1 error criterion, which is lower than that of the original PSMNet tested on the KITTI dataset.
Finally, we generated synthesized views using the Facebook 3D photo tools and our estimated depth maps. The good subjective quality of the synthesized images indicates that our estimated depth maps are quite accurate. Hang, Hsueh-Ming 杭學鳴 2019 thesis; 73 pages en_US
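The baseline-dependent maximum sensing distance described in the abstract can be sketched numerically. The following is a rough illustration, not the thesis's exact derivation: it assumes an equirectangular image whose width `width_px` spans the full 360°, and a point roughly perpendicular to the baseline, so the spherical disparity is approximately baseline/distance and the farthest resolvable distance is where the disparity falls below one pixel's angular size.

```python
import math

def max_sensing_distance(baseline_m, width_px):
    """Illustrative upper bound on sensing distance for a spherical stereo pair.

    Assumes an equirectangular image of width `width_px` covering 360 degrees,
    so one pixel subtends delta = 2*pi/width_px radians. For a point roughly
    perpendicular to the baseline, disparity d ~ baseline/r, so the farthest
    distance giving at least one pixel of disparity is r_max ~ baseline/delta.
    """
    delta = 2 * math.pi / width_px  # angular size of one pixel (radians)
    return baseline_m / delta
```

Under these assumptions, a 0.2 m baseline with 3840-pixel-wide images bounds the range at roughly 122 m, which illustrates why a longer baseline extends the usable sensing range.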
collection NDLTD
language en_US
format Others
sources NDLTD
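The D1 criterion cited in the abstract is the KITTI stereo benchmark's outlier rate: a pixel counts as erroneous when its disparity error exceeds both 3 pixels and 5% of the ground-truth disparity. A minimal sketch follows; the function name and the zero-means-invalid masking convention are illustrative, not taken from the thesis.

```python
import numpy as np

def d1_error_rate(pred, gt, abs_thresh=3.0, rel_thresh=0.05):
    """KITTI D1 metric: fraction of valid pixels whose disparity error
    exceeds both `abs_thresh` pixels and `rel_thresh` of the ground truth."""
    valid = gt > 0  # treat zero ground-truth disparity as invalid
    err = np.abs(pred[valid] - gt[valid])
    outlier = (err > abs_thresh) & (err > rel_thresh * gt[valid])
    return outlier.mean()
```

For example, if one of four valid pixels has a 10-pixel error against a 10-pixel ground-truth disparity, the rate is 0.25; the thesis's reported 2.18% is this fraction over the SYNTHIA test set.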
author2 Hang, Hsueh-Ming
author_facet Hang, Hsueh-Ming
Hsieh, Meng-Hsun
謝孟勳
author Hsieh, Meng-Hsun
謝孟勳
spellingShingle Hsieh, Meng-Hsun
謝孟勳
Deep Learning Based Depth Estimation and Analysis for 360° Stereo Cameras
author_sort Hsieh, Meng-Hsun
title Deep Learning Based Depth Estimation and Analysis for 360° Stereo Cameras
title_short Deep Learning Based Depth Estimation and Analysis for 360° Stereo Cameras
title_full Deep Learning Based Depth Estimation and Analysis for 360° Stereo Cameras
title_fullStr Deep Learning Based Depth Estimation and Analysis for 360° Stereo Cameras
title_full_unstemmed Deep Learning Based Depth Estimation and Analysis for 360° Stereo Cameras
title_sort deep learning based depth estimation and analysis for 360° stereo cameras
publishDate 2019
url http://ndltd.ncl.edu.tw/handle/y7mjfw
work_keys_str_mv AT hsiehmenghsun deeplearningbaseddepthestimationandanalysisfor360stereocameras
AT xièmèngxūn deeplearningbaseddepthestimationandanalysisfor360stereocameras
AT hsiehmenghsun jīyúshēndùxuéxízhīshuāngqiúxíngquánjǐngxiāngjīhuánjǐngshēndùtúgūcèyǔfēnxī
AT xièmèngxūn jīyúshēndùxuéxízhīshuāngqiúxíngquánjǐngxiāngjīhuánjǐngshēndùtúgūcèyǔfēnxī
_version_ 1719296469507244032