Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages

Recent applications of fully convolutional networks (FCNs) have improved the semantic segmentation of very high resolution (VHR) remote-sensing images because of their excellent feature representation and end-to-end pixel labeling capabilities. While many FCN-based methods concatenate features from multilevel encoding stages to refine the coarse labeling results, the semantic gap between features of different levels and the selection of representative features are often overlooked, leading to redundant information and unexpected classification results. In this article, we propose an attention-guided label refinement network (ALRNet) for improved semantic labeling of VHR images. ALRNet follows the encoder–decoder paradigm and progressively refines the coarse labeling maps of different scales by using a channelwise attention mechanism. A novel attention-guided feature fusion module based on the squeeze-and-excitation module is designed to fuse higher level and lower level features. In this way, the semantic gaps among features of different levels are reduced, and the category discrimination of each pixel in the lower level features is strengthened, which helps subsequent label refinement. ALRNet is tested on three public datasets, including two ISPRS 2-D labeling datasets and the Wuhan University aerial building dataset. Results demonstrate that ALRNet achieves promising segmentation performance in comparison with state-of-the-art deep learning networks. The source code of ALRNet is made publicly available for further studies.


Bibliographic Details
Main Authors: Jianfeng Huang, Xinchang Zhang, Ying Sun, Qinchuan Xin
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects: Attention mechanism; convolutional neural networks (CNNs); deep learning; semantic segmentation; urban object extraction; very high spatial resolution images
Online Access: https://ieeexplore.ieee.org/document/9410460/
id doaj-764513fc4ddb4b0182e1d8982e54c57c
record_format Article
spelling doaj-764513fc4ddb4b0182e1d8982e54c57c | 2021-06-03T23:07:34Z | eng | IEEE | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing | ISSN 2151-1535 | 2021-01-01 | vol. 14, pp. 4490-4503 | DOI 10.1109/JSTARS.2021.3073935 | IEEE article no. 9410460 | Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages | Jianfeng Huang (https://orcid.org/0000-0002-1940-5708), Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Atmospheric Sciences, Sun Yat-sen University, Zhuhai, China | Xinchang Zhang (https://orcid.org/0000-0001-8463-9757), School of Geography and Remote Sensing, Guangzhou University, Guangzhou, China | Ying Sun (https://orcid.org/0000-0002-9350-021X), Guangdong Key Laboratory for Urbanization and Geo-simulation and the School of Geography and Planning, Sun Yat-sen University, Guangzhou, China | Qinchuan Xin (https://orcid.org/0000-0003-1146-4874), Guangdong Key Laboratory for Urbanization and Geo-simulation and the School of Geography and Planning, Sun Yat-sen University, Guangzhou, China | Recent applications of fully convolutional networks (FCNs) have improved the semantic segmentation of very high resolution (VHR) remote-sensing images because of their excellent feature representation and end-to-end pixel labeling capabilities. While many FCN-based methods concatenate features from multilevel encoding stages to refine the coarse labeling results, the semantic gap between features of different levels and the selection of representative features are often overlooked, leading to redundant information and unexpected classification results. In this article, we propose an attention-guided label refinement network (ALRNet) for improved semantic labeling of VHR images. ALRNet follows the encoder–decoder paradigm and progressively refines the coarse labeling maps of different scales by using a channelwise attention mechanism. A novel attention-guided feature fusion module based on the squeeze-and-excitation module is designed to fuse higher level and lower level features. In this way, the semantic gaps among features of different levels are reduced, and the category discrimination of each pixel in the lower level features is strengthened, which helps subsequent label refinement. ALRNet is tested on three public datasets, including two ISPRS 2-D labeling datasets and the Wuhan University aerial building dataset. Results demonstrate that ALRNet achieves promising segmentation performance in comparison with state-of-the-art deep learning networks. The source code of ALRNet is made publicly available for further studies. | https://ieeexplore.ieee.org/document/9410460/ | Attention mechanism; convolutional neural networks (CNNs); deep learning; semantic segmentation; urban object extraction; very high spatial resolution images
collection DOAJ
language English
format Article
sources DOAJ
author Jianfeng Huang
Xinchang Zhang
Ying Sun
Qinchuan Xin
spellingShingle Jianfeng Huang
Xinchang Zhang
Ying Sun
Qinchuan Xin
Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Attention mechanism
convolutional neural networks (CNNs)
deep learning
semantic segmentation
urban object extraction
very high spatial resolution images
author_facet Jianfeng Huang
Xinchang Zhang
Ying Sun
Qinchuan Xin
author_sort Jianfeng Huang
title Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages
title_short Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages
title_full Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages
title_fullStr Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages
title_full_unstemmed Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages
title_sort attention-guided label refinement network for semantic segmentation of very high resolution aerial orthoimages
publisher IEEE
series IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
issn 2151-1535
publishDate 2021-01-01
description Recent applications of fully convolutional networks (FCNs) have improved the semantic segmentation of very high resolution (VHR) remote-sensing images because of their excellent feature representation and end-to-end pixel labeling capabilities. While many FCN-based methods concatenate features from multilevel encoding stages to refine the coarse labeling results, the semantic gap between features of different levels and the selection of representative features are often overlooked, leading to redundant information and unexpected classification results. In this article, we propose an attention-guided label refinement network (ALRNet) for improved semantic labeling of VHR images. ALRNet follows the encoder–decoder paradigm and progressively refines the coarse labeling maps of different scales by using a channelwise attention mechanism. A novel attention-guided feature fusion module based on the squeeze-and-excitation module is designed to fuse higher level and lower level features. In this way, the semantic gaps among features of different levels are reduced, and the category discrimination of each pixel in the lower level features is strengthened, which helps subsequent label refinement. ALRNet is tested on three public datasets, including two ISPRS 2-D labeling datasets and the Wuhan University aerial building dataset. Results demonstrate that ALRNet achieves promising segmentation performance in comparison with state-of-the-art deep learning networks. The source code of ALRNet is made publicly available for further studies.
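note The description above refers to an attention-guided feature fusion module that applies squeeze-and-excitation (SE) channelwise attention when fusing higher level and lower level features in an encoder–decoder. The following PyTorch sketch only illustrates that general idea; it is not the authors' released ALRNet code, and every name and default in it (SEGate, AttentionGuidedFusion, the reduction ratio, the 1x1/3x3 convolutions) is an assumption made for illustration.

```python
# Hypothetical sketch of SE-style attention-guided feature fusion (not the ALRNet source code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SEGate(nn.Module):
    """Squeeze-and-excitation channel gate: global average pool, two FC layers, sigmoid."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W); squeeze spatial dims, excite channels, rescale the input.
        w = self.fc(x.mean(dim=(2, 3)))           # (N, C) channel weights
        return x * w.unsqueeze(-1).unsqueeze(-1)  # broadcast back to (N, C, H, W)


class AttentionGuidedFusion(nn.Module):
    """Fuse an upsampled higher-level map with an SE-reweighted lower-level map."""

    def __init__(self, high_channels: int, low_channels: int, out_channels: int):
        super().__init__()
        self.reduce_high = nn.Conv2d(high_channels, out_channels, kernel_size=1)
        self.reduce_low = nn.Conv2d(low_channels, out_channels, kernel_size=1)
        self.gate = SEGate(out_channels)
        self.refine = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, high: torch.Tensor, low: torch.Tensor) -> torch.Tensor:
        # Upsample the deeper (coarser) features to the spatial size of the shallower ones.
        high = F.interpolate(self.reduce_high(high), size=low.shape[2:],
                             mode="bilinear", align_corners=False)
        # Channelwise attention strengthens the informative low-level channels before fusion.
        low = self.gate(self.reduce_low(low))
        return self.refine(high + low)
```

For example, AttentionGuidedFusion(high_channels=512, low_channels=256, out_channels=256) would fuse a 512-channel decoder map with a 256-channel encoder map at the encoder map's resolution; the channel counts, gating placement, and refinement layers in the actual ALRNet may differ.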
topic Attention mechanism
convolutional neural networks (CNNs)
deep learning
semantic segmentation
urban object extraction
very high spatial resolution images
url https://ieeexplore.ieee.org/document/9410460/
work_keys_str_mv AT jianfenghuang attentionguidedlabelrefinementnetworkforsemanticsegmentationofveryhighresolutionaerialorthoimages
AT xinchangzhang attentionguidedlabelrefinementnetworkforsemanticsegmentationofveryhighresolutionaerialorthoimages
AT yingsun attentionguidedlabelrefinementnetworkforsemanticsegmentationofveryhighresolutionaerialorthoimages
AT qinchuanxin attentionguidedlabelrefinementnetworkforsemanticsegmentationofveryhighresolutionaerialorthoimages
_version_ 1721398572342050816