Deep Learning Based Depth Estimation and Analysis for 360° Stereo Cameras
Title (Chinese): 基於深度學習之雙球型全景相機環景深度圖估測與分析
Author: Hsieh, Meng-Hsun (謝孟勳)
Advisor: Hang, Hsueh-Ming (杭學鳴)
Degree: Master's thesis (碩士)
Institution: National Chiao Tung University (國立交通大學), Institute of Electronics (電子研究所)
Academic Year: 107 (2018/2019)
Published: 2019
Pages: 73
Language: English
Online Access: http://ndltd.ncl.edu.tw/handle/y7mjfw
Abstract:
360° virtual view synthesis plays an important role in virtual reality, and the depth map is the key information for reconstructing the 3D world. In this study, we use two spherical cameras to form a 360° stereo system that captures the entire surrounding scene in two views. We then use these two spherical images to estimate a spherical depth map. We developed a depth estimation procedure for spherical stereo images using an existing neural network, PSMNet. To train the network for spherical disparity estimation, we built a panoramic stereo image dataset, with disparity ground truth, based on the SYNTHIA dataset.
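Below is a minimal inference sketch, not the thesis code: the `model` wrapper, tensor shapes, and normalization are assumptions, since public PSMNet implementations differ in their exact interfaces.

```python
# Hypothetical inference sketch for a PSMNet-style stereo network applied to
# an equirectangular (spherical) image pair. `model` is assumed to map a
# left/right batch to a per-pixel disparity map, as stereo networks typically do.
import math
import torch

def estimate_spherical_disparity(model, left_eq, right_eq):
    """left_eq, right_eq: float tensors of shape (3, H, W), already rectified
    so corresponding points lie along one image axis and the matching search
    is one-dimensional, as in perspective stereo."""
    with torch.no_grad():
        disp_px = model(left_eq.unsqueeze(0), right_eq.unsqueeze(0))
    # For an equirectangular image of width W spanning 360 degrees, one pixel
    # corresponds to 2*pi/W radians, so pixel disparity converts directly to
    # the angular disparity used in the geometric analysis below.
    angular = disp_px * (2.0 * math.pi / left_eq.shape[-1])
    return disp_px.squeeze(0), angular.squeeze(0)
```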
More importantly, we investigated the limits of spherical-image depth estimation. Unlike the disparity definition in perspective-view stereo, spherical disparity is measured as the angular difference of the same object point seen from the two views. Thus, an object aligned with the baseline has zero spherical disparity. From the finite pixel resolution of the image plane, we derived the maximum sensing distance for spherical disparity estimation. We also studied the occlusion problem of a surface in spherical stereo and derived the minimum reliable sensing distance. Both distance limits are functions of the baseline length; these properties help in choosing an appropriate baseline when constructing a spherical stereo rig.
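The following sketch works out this geometry under the stated definitions; it is our reconstruction from the abstract, not the thesis derivation, and the example numbers are illustrative.

```python
# Spherical-stereo geometry: two camera centers separated by baseline b form a
# triangle with object point P. theta is the viewing angle at the reference
# camera measured from the baseline; d is the angular disparity (the angle
# subtended at P). Law of sines: r / sin(pi - theta - d) = b / sin(d).
import math

def depth_from_angular_disparity(b, theta, d):
    """Range from the reference camera to P: r = b * sin(theta + d) / sin(d).
    r is finite only if d > 0; a point on the baseline (theta = 0 or pi) has
    d = 0 at any range, matching the zero-disparity case described above."""
    return b * math.sin(theta + d) / math.sin(d)

def max_sensing_distance(b, theta, width):
    """Farthest range whose disparity still spans one pixel of an
    equirectangular panorama of the given width (pixel pitch 2*pi/width).
    The limit grows roughly linearly with the baseline b."""
    return depth_from_angular_disparity(b, theta, 2.0 * math.pi / width)

# Example: b = 0.3 m, theta = 90 deg, 4096-px-wide panorama
# -> max_sensing_distance ~= 0.3 / (2*pi/4096) ~= 196 m.
```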
In our experiments, we performed depth estimation on both synthetic images and real-scene images, and evaluated the performance on the synthetic images against the ground-truth depth. On the SYNTHIA test set, we achieve a 2.18% error rate under the KITTI-benchmark D1 criterion, which is lower than the error rate of the original PSMNet tested on the KITTI dataset. Finally, we generated synthesized views using the Facebook 3D photo tools and our estimated depth maps. The good subjective quality of the synthesized images indicates that our estimated depth maps are rather accurate.
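For reference, the D1 criterion cited above is the standard KITTI 2015 outlier rate; a short implementation of that definition (our own, not from the thesis):

```python
# KITTI 2015 "D1" outlier rate: a pixel is an outlier if its disparity error
# exceeds both 3 pixels and 5% of the ground-truth disparity.
import numpy as np

def d1_error(disp_est, disp_gt, valid_mask=None):
    """Fraction of valid pixels that are D1 outliers (lower is better)."""
    if valid_mask is None:
        valid_mask = disp_gt > 0  # KITTI encodes missing ground truth as 0
    err = np.abs(disp_est - disp_gt)
    outliers = (err > 3.0) & (err > 0.05 * disp_gt) & valid_mask
    return np.count_nonzero(outliers) / np.count_nonzero(valid_mask)
```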