Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages
Recent applications of fully convolutional networks (FCNs) have been shown to improve the semantic segmentation of very high resolution (VHR) remote-sensing images because of their excellent feature representation and end-to-end pixel labeling capabilities. While many FCN-based methods concatenate feat...
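The full abstract in this record's description field says the network fuses higher level and lower level features with a module based on squeeze-and-excitation channel attention. The following is a minimal sketch of that general idea in plain Python, not the authors' implementation: the fusion rule in `fuse` (gate the low-level channels, then add the high-level features), the function names, and the assumption that both inputs already share the same shape are all illustrative choices that do not come from the record.

```python
import math
import random


def squeeze(feat):
    """Global average pooling: [C][H][W] nested lists -> C channel means."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feat]


def dense(vec, weights, bias):
    """Fully connected layer; weights has shape [out][in]."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def se_gate(feat, w1, b1, w2, b2):
    """Squeeze-and-excitation: pool, bottleneck + ReLU, expand + sigmoid."""
    hidden = [max(0.0, v) for v in dense(squeeze(feat), w1, b1)]
    return [1.0 / (1.0 + math.exp(-v)) for v in dense(hidden, w2, b2)]


def fuse(high, low, w1, b1, w2, b2):
    """Hypothetical fusion rule (an assumption, not taken from the paper):
    weights computed from the concatenated features rescale each low-level
    channel, and the high-level features are added back."""
    gate = se_gate(high + low, w1, b1, w2, b2)  # list concat = channel concat
    return [[[h + g * l for h, l in zip(h_row, l_row)]
             for h_row, l_row in zip(h_ch, l_ch)]
            for h_ch, l_ch, g in zip(high, low, gate)]


def random_tensor(c, h, w):
    """Random [C][H][W] feature map used only for the demo below."""
    return [[[random.random() for _ in range(w)] for _ in range(h)]
            for _ in range(c)]


if __name__ == "__main__":
    random.seed(0)
    C, H, W, R = 4, 3, 3, 2  # channels, height, width, bottleneck width
    high, low = random_tensor(C, H, W), random_tensor(C, H, W)
    w1 = [[random.uniform(-1, 1) for _ in range(2 * C)] for _ in range(R)]
    w2 = [[random.uniform(-1, 1) for _ in range(R)] for _ in range(C)]
    fused = fuse(high, low, w1, [0.0] * R, w2, [0.0] * C)
    print(len(fused), len(fused[0]), len(fused[0][0]))  # fused keeps C, H, W
```

In a trained network the weights `w1` and `w2` would be learned; here they are random placeholders. The gate values always lie between 0 and 1, so each low-level channel is attenuated according to how informative the pooled statistics suggest it is.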
Main Authors: | Jianfeng Huang, Xinchang Zhang, Ying Sun, Qinchuan Xin |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2021-01-01 |
Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
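Subjects: | Attention mechanism; convolutional neural networks (CNNs); deep learning; semantic segmentation; urban object extraction; very high spatial resolution images |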
Online Access: | https://ieeexplore.ieee.org/document/9410460/ |
id |
doaj-764513fc4ddb4b0182e1d8982e54c57c |
---|---|
record_format |
Article |
spelling |
doaj-764513fc4ddb4b0182e1d8982e54c57c; 2021-06-03T23:07:34Z; eng; IEEE; IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; 2151-1535; 2021-01-01; 14; 4490; 4503; 10.1109/JSTARS.2021.3073935; 9410460; Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages; Jianfeng Huang (0) https://orcid.org/0000-0002-1940-5708; Xinchang Zhang (1) https://orcid.org/0000-0001-8463-9757; Ying Sun (2) https://orcid.org/0000-0002-9350-021X; Qinchuan Xin (3) https://orcid.org/0000-0003-1146-4874; Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Atmospheric Sciences, Sun Yat-sen University, Zhuhai, China; School of Geography and Remote Sensing, Guangzhou University, Guangzhou, China; Guangdong Key Laboratory for Urbanization and Geo-simulation and the School of Geography and Planning, Sun Yat-sen University, Guangzhou, China; Guangdong Key Laboratory for Urbanization and Geo-simulation and the School of Geography and Planning, Sun Yat-sen University, Guangzhou, China; Recent applications of fully convolutional networks (FCNs) have been shown to improve the semantic segmentation of very high resolution (VHR) remote-sensing images because of their excellent feature representation and end-to-end pixel labeling capabilities. While many FCN-based methods concatenate features from multilevel encoding stages to refine the coarse labeling results, the semantic gap between features of different levels and the selection of representative features are often overlooked, leading to redundant information and unexpected classification results. In this article, we propose an attention-guided label refinement network (ALRNet) for improved semantic labeling of VHR images. ALRNet follows the encoder–decoder paradigm and progressively refines the coarse labeling maps of different scales by using a channelwise attention mechanism. A novel attention-guided feature fusion module based on the squeeze-and-excitation module is designed to fuse higher level and lower level features. In this way, the semantic gaps among features of different levels are reduced, and the category discrimination of each pixel in the lower level features is strengthened, which helps subsequent label refinement. ALRNet is tested on three public datasets, including two ISPRS 2-D labeling datasets and the Wuhan University aerial building dataset. Results demonstrate that ALRNet achieves promising segmentation performance in comparison with state-of-the-art deep learning networks. The source code of ALRNet is made publicly available for further studies.; https://ieeexplore.ieee.org/document/9410460/; Attention mechanism; convolutional neural networks (CNNs); deep learning; semantic segmentation; urban object extraction; very high spatial resolution images |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Jianfeng Huang, Xinchang Zhang, Ying Sun, Qinchuan Xin |
author_sort |
Jianfeng Huang |
title |
Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages |
publisher |
IEEE |
series |
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
issn |
2151-1535 |
publishDate |
2021-01-01 |
description |
Recent applications of fully convolutional networks (FCNs) have been shown to improve the semantic segmentation of very high resolution (VHR) remote-sensing images because of their excellent feature representation and end-to-end pixel labeling capabilities. While many FCN-based methods concatenate features from multilevel encoding stages to refine the coarse labeling results, the semantic gap between features of different levels and the selection of representative features are often overlooked, leading to redundant information and unexpected classification results. In this article, we propose an attention-guided label refinement network (ALRNet) for improved semantic labeling of VHR images. ALRNet follows the encoder–decoder paradigm and progressively refines the coarse labeling maps of different scales by using a channelwise attention mechanism. A novel attention-guided feature fusion module based on the squeeze-and-excitation module is designed to fuse higher level and lower level features. In this way, the semantic gaps among features of different levels are reduced, and the category discrimination of each pixel in the lower level features is strengthened, which helps subsequent label refinement. ALRNet is tested on three public datasets, including two ISPRS 2-D labeling datasets and the Wuhan University aerial building dataset. Results demonstrate that ALRNet achieves promising segmentation performance in comparison with state-of-the-art deep learning networks. The source code of ALRNet is made publicly available for further studies. |
topic |
Attention mechanism; convolutional neural networks (CNNs); deep learning; semantic segmentation; urban object extraction; very high spatial resolution images |
url |
https://ieeexplore.ieee.org/document/9410460/ |