Traffic Scene Depth Analysis Based on Depthwise Separable Convolutional Neural Network

To obtain the distances between a vehicle and the surrounding objects in the traffic scene ahead of it, this study proposes a monocular depth estimation method based on a depthwise separable convolutional neural network. First, features containing shallow depth information are extracted from the RGB image by convolution and max-pooling layers, which also subsample the feature maps. Next, features containing high-level depth information are extracted by a block built from an ensemble of convolution layers and a block built from depthwise separable convolution layers, and the outputs of the different blocks are combined. Finally, transposed convolution layers upsample the feature maps back to the size of the original RGB image; during upsampling, skip connections merge in the shallow depth features produced by the depthwise separable convolution layers. Depthwise separable convolution layers provide more accurate depth features for monocular depth estimation while requiring less computation and fewer parameters, with similar or slightly better performance. Grouping several simple convolutions into a block increases the overall depth of the network and allows high-level features to be extracted more accurately, and combining the outputs of multiple blocks prevents the loss of features carrying important depth information. Test results show that the depthwise separable convolutional neural network outperforms other monocular depth estimation methods; applying depthwise separable convolution layers is therefore an effective and accurate approach to visual depth estimation.
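The abstract gives only a high-level description of the network. As a rough illustration of its core building block, the sketch below implements a depthwise separable convolution (a per-channel depthwise convolution followed by a 1x1 pointwise convolution) in PyTorch; the kernel sizes, channel counts, and BatchNorm/ReLU choices are assumptions for illustration, not the configuration used in the paper.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise separable convolution: a per-channel 3x3 (depthwise) conv
    followed by a 1x1 (pointwise) conv that mixes channels.
    Layer sizes are illustrative assumptions, not the paper's settings."""

    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        # Pointwise: 1x1 conv recombines the per-channel outputs.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.relu(self.bn(self.pointwise(self.depthwise(x))))

# Parameter comparison for a 256 -> 256 channel layer with 3x3 kernels:
#   standard conv:            256 * 256 * 3 * 3       = 589,824 weights
#   depthwise separable conv: 256 * 3 * 3 + 256 * 256 =  67,840 weights (~8.7x fewer)
x = torch.randn(1, 256, 32, 32)
print(DepthwiseSeparableConv(256, 256)(x).shape)  # torch.Size([1, 256, 32, 32])
```

This factorization is where the computational and parameter savings mentioned in the abstract come from: spatial filtering and channel mixing are handled by two much smaller convolutions instead of one dense one.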

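Continuing the sketch above (same imports and the DepthwiseSeparableConv class), the following minimal encoder-decoder shows the overall pattern the abstract describes: shallow features from convolution and max pooling, depthwise separable blocks for high-level features, transposed-convolution upsampling, and a skip connection that merges the shallow features back in. The depths, channel counts, and block layout are assumptions; the paper's actual network is deeper and combines the outputs of several blocks.

```python
class TinyDepthNet(nn.Module):
    """Minimal encoder-decoder sketch of the pattern in the abstract.
    Channel counts and depths are illustrative assumptions."""

    def __init__(self):
        super().__init__()
        # Shallow depth features: convolution + max pooling (subsampling).
        self.stem = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                                        # H/2 x W/2
        )
        # High-level depth features from depthwise separable conv blocks.
        self.block1 = DepthwiseSeparableConv(32, 64, stride=2)      # H/4 x W/4
        self.block2 = DepthwiseSeparableConv(64, 64)
        # Decoder: transposed convolutions upsample back to the input size.
        self.up1 = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)  # H/2
        self.up2 = nn.ConvTranspose2d(64, 16, kernel_size=2, stride=2)  # H
        self.head = nn.Conv2d(16, 1, kernel_size=3, padding=1)      # 1-channel depth map

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        shallow = self.stem(x)                    # kept for the skip connection
        deep = self.block2(self.block1(shallow))  # high-level depth features
        up = self.up1(deep)
        up = torch.cat([up, shallow], dim=1)      # skip connection: merge shallow features
        return self.head(self.up2(up))

depth = TinyDepthNet()(torch.randn(1, 3, 64, 64))
print(depth.shape)  # torch.Size([1, 1, 64, 64]) -- same spatial size as the input
```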

Bibliographic Details
Main Authors: Jianzhong Yuan, Wujie Zhou, Sijia Lv, Yuzhen Chen
Format: Article
Language: English
Published: Hindawi Limited, 2019-01-01
Series: Journal of Electrical and Computer Engineering
Online Access: http://dx.doi.org/10.1155/2019/9340129
ISSN: 2090-0147, 2090-0155
Author Affiliation: School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, China (all four authors)
DOAJ record ID: doaj-57f3c8784324401fb5992835c2710e93