Traffic Scene Depth Analysis Based on Depthwise Separable Convolutional Neural Network

To obtain the distances between a vehicle and the surrounding objects in the traffic scene ahead of it, this study proposes a monocular depth estimation method based on a depthwise separable convolutional neural network. First, features containing shallow depth information are extracted from the RGB image by convolution and max-pooling layers, which also subsample the feature maps. Next, features containing high-level depth information are extracted by a block built from an ensemble of convolution layers and a block built from depthwise separable convolution layers, and the outputs of the different blocks are combined. Finally, transposed convolution layers upsample the feature maps back to the size of the original RGB image; during upsampling, skip connections merge in the shallow depth features produced by the depthwise separable convolution layers. Depthwise separable convolution layers provide more accurate depth features for monocular depth estimation while requiring less computation and fewer parameters, with similar or slightly better performance. Grouping several simple convolutions into a block increases the overall depth of the network and allows high-level features to be extracted more accurately, and combining the outputs of multiple blocks prevents the loss of features carrying important depth information. Test results show that the depthwise separable convolutional neural network outperforms other monocular depth estimation methods; applying depthwise separable convolution layers is therefore an effective and accurate approach to visual depth estimation.
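The abstract gives only a high-level description of the network. As a rough illustration of its core building block, the sketch below implements a depthwise separable convolution (a per-channel depthwise convolution followed by a 1x1 pointwise convolution) in PyTorch; the kernel sizes, channel counts, and BatchNorm/ReLU choices are assumptions for illustration, not the configuration used in the paper.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise separable convolution: a per-channel 3x3 (depthwise) conv
    followed by a 1x1 (pointwise) conv that mixes channels.
    Layer sizes are illustrative assumptions, not the paper's settings."""

    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        # Pointwise: 1x1 conv recombines the per-channel outputs.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.relu(self.bn(self.pointwise(self.depthwise(x))))

# Parameter comparison for a 256 -> 256 channel layer with 3x3 kernels:
#   standard conv:            256 * 256 * 3 * 3       = 589,824 weights
#   depthwise separable conv: 256 * 3 * 3 + 256 * 256 =  67,840 weights (~8.7x fewer)
x = torch.randn(1, 256, 32, 32)
print(DepthwiseSeparableConv(256, 256)(x).shape)  # torch.Size([1, 256, 32, 32])
```

This factorization is where the computational and parameter savings mentioned in the abstract come from: spatial filtering and channel mixing are handled by two much smaller convolutions instead of one dense one.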

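Continuing the sketch above (same imports and the DepthwiseSeparableConv class), the following minimal encoder-decoder shows the overall pattern the abstract describes: shallow features from convolution and max pooling, depthwise separable blocks for high-level features, transposed-convolution upsampling, and a skip connection that merges the shallow features back in. The depths, channel counts, and block layout are assumptions; the paper's actual network is deeper and combines the outputs of several blocks.

```python
class TinyDepthNet(nn.Module):
    """Minimal encoder-decoder sketch of the pattern in the abstract.
    Channel counts and depths are illustrative assumptions."""

    def __init__(self):
        super().__init__()
        # Shallow depth features: convolution + max pooling (subsampling).
        self.stem = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                                        # H/2 x W/2
        )
        # High-level depth features from depthwise separable conv blocks.
        self.block1 = DepthwiseSeparableConv(32, 64, stride=2)      # H/4 x W/4
        self.block2 = DepthwiseSeparableConv(64, 64)
        # Decoder: transposed convolutions upsample back to the input size.
        self.up1 = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)  # H/2
        self.up2 = nn.ConvTranspose2d(64, 16, kernel_size=2, stride=2)  # H
        self.head = nn.Conv2d(16, 1, kernel_size=3, padding=1)      # 1-channel depth map

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        shallow = self.stem(x)                    # kept for the skip connection
        deep = self.block2(self.block1(shallow))  # high-level depth features
        up = self.up1(deep)
        up = torch.cat([up, shallow], dim=1)      # skip connection: merge shallow features
        return self.head(self.up2(up))

depth = TinyDepthNet()(torch.randn(1, 3, 64, 64))
print(depth.shape)  # torch.Size([1, 1, 64, 64]) -- same spatial size as the input
```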

Bibliographic Details
Main Authors: Jianzhong Yuan, Wujie Zhou, Sijia Lv, Yuzhen Chen
Format: Article
Language: English
Published: Hindawi Limited, 2019-01-01
Series: Journal of Electrical and Computer Engineering
Online Access: http://dx.doi.org/10.1155/2019/9340129
ISSN: 2090-0147, 2090-0155
Author Affiliation: School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, China (all four authors)
DOAJ record ID: doaj-57f3c8784324401fb5992835c2710e93