Semantic Segmentation of Aerial Imagery via Split-Attention Networks with Disentangled Nonlocal and Edge Supervision

In this work, we propose a new deep convolutional neural network (DCNN) architecture for semantic segmentation of aerial imagery. Taking advantage of recent research, we use split-attention networks (ResNeSt) as the backbone for high-quality feature expression. Additionally, a disentangled nonlocal (D...

Bibliographic Details
Main Authors: Cheng Zhang, Wanshou Jiang, Qing Zhao
Format: Article
Language: English
Published: MDPI AG 2021-03-01
Series: Remote Sensing
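Subjects: semantic segmentation; ResNeSt; edge constraints; disentangled non-local; depth-wise separable ASPP; remote sensing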
Online Access: https://www.mdpi.com/2072-4292/13/6/1176
id doaj-6b6a05a4d04e4c0a880edcf087278612
record_format Article
spelling doaj-6b6a05a4d04e4c0a880edcf087278612
2021-03-20T00:04:38Z
eng
MDPI AG
Remote Sensing
2072-4292
2021-03-01
13
1176
1176
10.3390/rs13061176
Semantic Segmentation of Aerial Imagery via Split-Attention Networks with Disentangled Nonlocal and Edge Supervision
Cheng Zhang
Wanshou Jiang
Qing Zhao
State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan 430079, China (affiliation of all three authors)
In this work, we propose a new deep convolutional neural network (DCNN) architecture for semantic segmentation of aerial imagery. Taking advantage of recent research, we use split-attention networks (ResNeSt) as the backbone for high-quality feature expression. Additionally, a disentangled nonlocal (DNL) block is integrated into our pipeline to express the inter-pixel long-distance dependence and highlight the edge pixels simultaneously. Moreover, the depth-wise separable convolution and atrous spatial pyramid pooling (ASPP) modules are combined to extract and fuse multiscale contextual features. Finally, an auxiliary edge detection task is designed to provide edge constraints for semantic segmentation. Evaluation of algorithms is conducted on two benchmarks provided by the International Society for Photogrammetry and Remote Sensing (ISPRS). Extensive experiments demonstrate the effectiveness of each module of our architecture. Precision evaluation based on the Potsdam benchmark shows that the proposed DCNN achieves performance competitive with state-of-the-art methods.
https://www.mdpi.com/2072-4292/13/6/1176
semantic segmentation
ResNeSt
edge constraints
disentangled non-local
depth-wise separable ASPP
remote sensing
collection DOAJ
language English
format Article
sources DOAJ
author Cheng Zhang
Wanshou Jiang
Qing Zhao
title Semantic Segmentation of Aerial Imagery via Split-Attention Networks with Disentangled Nonlocal and Edge Supervision
publisher MDPI AG
series Remote Sensing
issn 2072-4292
publishDate 2021-03-01
description In this work, we propose a new deep convolutional neural network (DCNN) architecture for semantic segmentation of aerial imagery. Taking advantage of recent research, we use split-attention networks (ResNeSt) as the backbone for high-quality feature expression. Additionally, a disentangled nonlocal (DNL) block is integrated into our pipeline to express the inter-pixel long-distance dependence and highlight the edge pixels simultaneously. Moreover, the depth-wise separable convolution and atrous spatial pyramid pooling (ASPP) modules are combined to extract and fuse multiscale contextual features. Finally, an auxiliary edge detection task is designed to provide edge constraints for semantic segmentation. Evaluation of algorithms is conducted on two benchmarks provided by the International Society for Photogrammetry and Remote Sensing (ISPRS). Extensive experiments demonstrate the effectiveness of each module of our architecture. Precision evaluation based on the Potsdam benchmark shows that the proposed DCNN achieves performance competitive with state-of-the-art methods.
topic semantic segmentation
ResNeSt
edge constraints
disentangled non-local
depth-wise separable ASPP
remote sensing
url https://www.mdpi.com/2072-4292/13/6/1176