Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages

Recent applications of fully convolutional networks (FCNs) have improved the semantic segmentation of very high resolution (VHR) remote-sensing images because of their excellent feature representation and end-to-end pixel labeling capabilities. While many FCN-based methods concatenate features from multilevel encoding stages to refine the coarse labeling results, the semantic gap between features of different levels and the selection of representative features are often overlooked, leading to redundant information and unexpected classification results. In this article, we propose an attention-guided label refinement network (ALRNet) for improved semantic labeling of VHR images. ALRNet follows the encoder–decoder paradigm and progressively refines the coarse labeling maps of different scales by using a channelwise attention mechanism. A novel attention-guided feature fusion module based on the squeeze-and-excitation module is designed to fuse higher level and lower level features. In this way, the semantic gaps among features of different levels are reduced, and the category discrimination of each pixel in the lower level features is strengthened, which helps subsequent label refinement. ALRNet is tested on three public datasets, including two ISPRS 2-D labeling datasets and the Wuhan University aerial building dataset. Results demonstrate that ALRNet achieves promising segmentation performance in comparison with state-of-the-art deep learning networks. The source code of ALRNet is made publicly available for further studies.


Bibliographic Details
Main Authors: Jianfeng Huang, Xinchang Zhang, Ying Sun, Qinchuan Xin
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects: Attention mechanism; convolutional neural networks (CNNs); deep learning; semantic segmentation; urban object extraction; very high spatial resolution images
Online Access: https://ieeexplore.ieee.org/document/9410460/
id doaj-764513fc4ddb4b0182e1d8982e54c57c
record_format Article
spelling doaj-764513fc4ddb4b0182e1d8982e54c57c | 2021-06-03T23:07:34Z | eng | IEEE | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing | ISSN 2151-1535 | 2021-01-01 | vol. 14, pp. 4490-4503 | DOI 10.1109/JSTARS.2021.3073935 | IEEE article no. 9410460 | Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages | Jianfeng Huang (https://orcid.org/0000-0002-1940-5708), Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Atmospheric Sciences, Sun Yat-sen University, Zhuhai, China | Xinchang Zhang (https://orcid.org/0000-0001-8463-9757), School of Geography and Remote Sensing, Guangzhou University, Guangzhou, China | Ying Sun (https://orcid.org/0000-0002-9350-021X), Guangdong Key Laboratory for Urbanization and Geo-simulation and the School of Geography and Planning, Sun Yat-sen University, Guangzhou, China | Qinchuan Xin (https://orcid.org/0000-0003-1146-4874), Guangdong Key Laboratory for Urbanization and Geo-simulation and the School of Geography and Planning, Sun Yat-sen University, Guangzhou, China | Recent applications of fully convolutional networks (FCNs) have improved the semantic segmentation of very high resolution (VHR) remote-sensing images because of their excellent feature representation and end-to-end pixel labeling capabilities. While many FCN-based methods concatenate features from multilevel encoding stages to refine the coarse labeling results, the semantic gap between features of different levels and the selection of representative features are often overlooked, leading to redundant information and unexpected classification results. In this article, we propose an attention-guided label refinement network (ALRNet) for improved semantic labeling of VHR images. ALRNet follows the encoder–decoder paradigm and progressively refines the coarse labeling maps of different scales by using a channelwise attention mechanism. A novel attention-guided feature fusion module based on the squeeze-and-excitation module is designed to fuse higher level and lower level features. In this way, the semantic gaps among features of different levels are reduced, and the category discrimination of each pixel in the lower level features is strengthened, which helps subsequent label refinement. ALRNet is tested on three public datasets, including two ISPRS 2-D labeling datasets and the Wuhan University aerial building dataset. Results demonstrate that ALRNet achieves promising segmentation performance in comparison with state-of-the-art deep learning networks. The source code of ALRNet is made publicly available for further studies. | https://ieeexplore.ieee.org/document/9410460/ | Attention mechanism; convolutional neural networks (CNNs); deep learning; semantic segmentation; urban object extraction; very high spatial resolution images
collection DOAJ
language English
format Article
sources DOAJ
author Jianfeng Huang
Xinchang Zhang
Ying Sun
Qinchuan Xin
spellingShingle Jianfeng Huang
Xinchang Zhang
Ying Sun
Qinchuan Xin
Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Attention mechanism
convolutional neural networks (CNNs)
deep learning
semantic segmentation
urban object extraction
very high spatial resolution images
author_facet Jianfeng Huang
Xinchang Zhang
Ying Sun
Qinchuan Xin
author_sort Jianfeng Huang
title Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages
title_short Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages
title_full Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages
title_fullStr Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages
title_full_unstemmed Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages
title_sort attention-guided label refinement network for semantic segmentation of very high resolution aerial orthoimages
publisher IEEE
series IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
issn 2151-1535
publishDate 2021-01-01
description Recent applications of fully convolutional networks (FCNs) have improved the semantic segmentation of very high resolution (VHR) remote-sensing images because of their excellent feature representation and end-to-end pixel labeling capabilities. While many FCN-based methods concatenate features from multilevel encoding stages to refine the coarse labeling results, the semantic gap between features of different levels and the selection of representative features are often overlooked, leading to redundant information and unexpected classification results. In this article, we propose an attention-guided label refinement network (ALRNet) for improved semantic labeling of VHR images. ALRNet follows the encoder–decoder paradigm and progressively refines the coarse labeling maps of different scales by using a channelwise attention mechanism. A novel attention-guided feature fusion module based on the squeeze-and-excitation module is designed to fuse higher level and lower level features. In this way, the semantic gaps among features of different levels are reduced, and the category discrimination of each pixel in the lower level features is strengthened, which helps subsequent label refinement. ALRNet is tested on three public datasets, including two ISPRS 2-D labeling datasets and the Wuhan University aerial building dataset. Results demonstrate that ALRNet achieves promising segmentation performance in comparison with state-of-the-art deep learning networks. The source code of ALRNet is made publicly available for further studies.
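note The description above refers to an attention-guided feature fusion module that applies squeeze-and-excitation (SE) channelwise attention when fusing higher level and lower level features in an encoder–decoder. The following PyTorch sketch only illustrates that general idea; it is not the authors' released ALRNet code, and every name and default in it (SEGate, AttentionGuidedFusion, the reduction ratio, the 1x1/3x3 convolutions) is an assumption made for illustration.

```python
# Hypothetical sketch of SE-style attention-guided feature fusion (not the ALRNet source code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SEGate(nn.Module):
    """Squeeze-and-excitation channel gate: global average pool, two FC layers, sigmoid."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W); squeeze spatial dims, excite channels, rescale the input.
        w = self.fc(x.mean(dim=(2, 3)))           # (N, C) channel weights
        return x * w.unsqueeze(-1).unsqueeze(-1)  # broadcast back to (N, C, H, W)


class AttentionGuidedFusion(nn.Module):
    """Fuse an upsampled higher-level map with an SE-reweighted lower-level map."""

    def __init__(self, high_channels: int, low_channels: int, out_channels: int):
        super().__init__()
        self.reduce_high = nn.Conv2d(high_channels, out_channels, kernel_size=1)
        self.reduce_low = nn.Conv2d(low_channels, out_channels, kernel_size=1)
        self.gate = SEGate(out_channels)
        self.refine = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, high: torch.Tensor, low: torch.Tensor) -> torch.Tensor:
        # Upsample the deeper (coarser) features to the spatial size of the shallower ones.
        high = F.interpolate(self.reduce_high(high), size=low.shape[2:],
                             mode="bilinear", align_corners=False)
        # Channelwise attention strengthens the informative low-level channels before fusion.
        low = self.gate(self.reduce_low(low))
        return self.refine(high + low)
```

For example, AttentionGuidedFusion(high_channels=512, low_channels=256, out_channels=256) would fuse a 512-channel decoder map with a 256-channel encoder map at the encoder map's resolution; the channel counts, gating placement, and refinement layers in the actual ALRNet may differ.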
topic Attention mechanism
convolutional neural networks (CNNs)
deep learning
semantic segmentation
urban object extraction
very high spatial resolution images
url https://ieeexplore.ieee.org/document/9410460/
work_keys_str_mv AT jianfenghuang attentionguidedlabelrefinementnetworkforsemanticsegmentationofveryhighresolutionaerialorthoimages
AT xinchangzhang attentionguidedlabelrefinementnetworkforsemanticsegmentationofveryhighresolutionaerialorthoimages
AT yingsun attentionguidedlabelrefinementnetworkforsemanticsegmentationofveryhighresolutionaerialorthoimages
AT qinchuanxin attentionguidedlabelrefinementnetworkforsemanticsegmentationofveryhighresolutionaerialorthoimages
_version_ 1721398572342050816