Parallel global convolutional network for semantic image segmentation

Abstract In this paper, a novel convolutional neural network for fast semantic segmentation is presented. Deep convolutional neural networks have achieved great progress on the task of visual scene understanding, but their gains in accuracy come mainly from increasing depth and width, which slows down large networks and increases power consumption. A fast and efficient convolutional neural network, PGCNet, is introduced, aimed at segmenting high-resolution images at high speed. Compared with competitive methods, the resulting model achieves high performance with fewer parameters and floating-point operations. First, a lightweight general-purpose architecture pre-trained on ImageNet is adopted as the main encoder. Second, a novel lateral connection module is introduced to better transmit features from the encoder to the decoder. Third, a powerful module termed the PGCN block is proposed to extract features from each block of the encoder, and an edge decoder is applied during training as supervision for pixels on the boundaries of stuff and things. Experiments show that the method has clear advantages: based on the proposed PGCNet, 75.8% mean IoU is achieved on the Cityscapes test set, and the network runs at 35.4 Hz on a standard Cityscapes image on a GTX 1080 Ti.
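The record contains no implementation details beyond the abstract, but the name suggests a block built from large factorised convolutions applied in parallel, in the spirit of earlier global convolutional network designs. The PyTorch snippet below is a minimal sketch of that general idea, not the authors' PGCN block: the kernel size k, the channel counts, the batch-norm/ReLU placement, and the way the two factorised branches are summed are all assumptions made here for illustration.

```python
# Minimal sketch of a "parallel global convolution" block (assumed design,
# not the authors' PGCN block): two factorised large-kernel branches
# (1xk then kx1, and kx1 then 1xk) run in parallel and are summed,
# approximating a dense kxk receptive field at much lower cost.
import torch
import torch.nn as nn


class ParallelGlobalConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 7):
        super().__init__()
        pad = k // 2
        # Branch A: 1xk followed by kx1.
        self.branch_a = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=(1, k), padding=(0, pad)),
            nn.Conv2d(out_ch, out_ch, kernel_size=(k, 1), padding=(pad, 0)),
        )
        # Branch B: kx1 followed by 1xk.
        self.branch_b = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=(k, 1), padding=(pad, 0)),
            nn.Conv2d(out_ch, out_ch, kernel_size=(1, k), padding=(0, pad)),
        )
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Sum the two parallel branches, then normalise and activate.
        return self.act(self.bn(self.branch_a(x) + self.branch_b(x)))


if __name__ == "__main__":
    block = ParallelGlobalConv(in_ch=256, out_ch=128, k=7)
    feat = torch.randn(1, 256, 64, 128)   # one encoder feature map
    print(block(feat).shape)              # torch.Size([1, 128, 64, 128])
```

Factorising a k x k kernel into 1 x k and k x 1 convolutions keeps the large receptive field that dense per-pixel classification benefits from while using O(k) rather than O(k^2) weights per output channel, which is consistent with the abstract's emphasis on fewer parameters and floating-point operations.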


Bibliographic Details
Main Authors: Xing Bai, Jun Zhou
Format: Article
Language: English
Published: Wiley 2021-01-01
Series: IET Image Processing
Online Access: https://doi.org/10.1049/ipr2.12025
id doaj-a6613230acf041fea560302e77f00c1f
record_format Article
spelling doaj-a6613230acf041fea560302e77f00c1f (2021-07-14T13:25:38Z): Parallel global convolutional network for semantic image segmentation. Xing Bai and Jun Zhou (Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing, China). IET Image Processing (Wiley), ISSN 1751-9659, 1751-9667; 2021-01-01; vol. 15, no. 1, pp. 252-259; doi:10.1049/ipr2.12025. https://doi.org/10.1049/ipr2.12025
collection DOAJ
language English
format Article
sources DOAJ
author Xing Bai
Jun Zhou
spellingShingle Xing Bai
Jun Zhou
Parallel global convolutional network for semantic image segmentation
IET Image Processing
author_facet Xing Bai
Jun Zhou
author_sort Xing Bai
title Parallel global convolutional network for semantic image segmentation
title_short Parallel global convolutional network for semantic image segmentation
title_full Parallel global convolutional network for semantic image segmentation
title_fullStr Parallel global convolutional network for semantic image segmentation
title_full_unstemmed Parallel global convolutional network for semantic image segmentation
title_sort parallel global convolutional network for semantic image segmentation
publisher Wiley
series IET Image Processing
issn 1751-9659
1751-9667
publishDate 2021-01-01
description Abstract In this paper, a novel convolutional neural network for fast semantic segmentation is presented. Deep convolutional neural networks have achieved great progress on the task of visual scene understanding, but their gains in accuracy come mainly from increasing depth and width, which slows down large networks and increases power consumption. A fast and efficient convolutional neural network, PGCNet, is introduced, aimed at segmenting high-resolution images at high speed. Compared with competitive methods, the resulting model achieves high performance with fewer parameters and floating-point operations. First, a lightweight general-purpose architecture pre-trained on ImageNet is adopted as the main encoder. Second, a novel lateral connection module is introduced to better transmit features from the encoder to the decoder. Third, a powerful module termed the PGCN block is proposed to extract features from each block of the encoder, and an edge decoder is applied during training as supervision for pixels on the boundaries of stuff and things. Experiments show that the method has clear advantages: based on the proposed PGCNet, 75.8% mean IoU is achieved on the Cityscapes test set, and the network runs at 35.4 Hz on a standard Cityscapes image on a GTX 1080 Ti.
url https://doi.org/10.1049/ipr2.12025
work_keys_str_mv AT xingbai parallelglobalconvolutionalnetworkforsemanticimagesegmentation
AT junzhou parallelglobalconvolutionalnetworkforsemanticimagesegmentation
_version_ 1721302730931175424
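
The record reports 75.8% mean IoU on the Cityscapes test set. Mean IoU is the per-class intersection over union averaged over the evaluated classes (19 for Cityscapes). The snippet below is a plain NumPy illustration of that standard metric, not the benchmark's official evaluation code; the 255 ignore label and the toy prediction are assumptions made for the example.

```python
# Standard mean IoU from a confusion matrix: for each class c,
# IoU_c = TP_c / (TP_c + FP_c + FN_c); mIoU is the average over classes.
import numpy as np


def mean_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int, ignore_index: int = 255) -> float:
    # Drop pixels labelled with the ignore index (255 in Cityscapes).
    valid = gt != ignore_index
    pred, gt = pred[valid], gt[valid]
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(gt * num_classes + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes).astype(np.float64)
    tp = np.diag(conf)
    fp = conf.sum(axis=0) - tp
    fn = conf.sum(axis=1) - tp
    denom = tp + fp + fn
    # Classes absent from both prediction and ground truth are skipped via NaN.
    iou = np.where(denom > 0, tp / np.maximum(denom, 1), np.nan)
    return float(np.nanmean(iou))


if __name__ == "__main__":
    gt = np.random.randint(0, 19, size=(1024, 2048))   # Cityscapes-sized label map
    pred = gt.copy()
    pred[:, :100] = (pred[:, :100] + 1) % 19           # perturb a strip of pixels
    print(f"mIoU: {mean_iou(pred, gt, num_classes=19):.3f}")
```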