Automatic Building Extraction From High-Resolution Aerial Imagery via Fully Convolutional Encoder-Decoder Network With Non-Local Block
Extracting buildings automatically from high-resolution aerial images is a significant and fundamental task for various practical applications, such as land-use statistics and urban planning. Recently, various methods based on deep learning, especially the fully convolution networks, achieve impress...
Main Authors: | Shengsheng Wang, Xiaowei Hou, Xin Zhao |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
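Subjects: | Deep learning, semantic segmentation, fully convolution network, building extraction, non-local method |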
Online Access: | https://ieeexplore.ieee.org/document/8950134/ |
id |
doaj-a481e9b2e886482082433e197569cc5b |
---|---|
record_format |
Article |
spelling |
doaj-a481e9b2e886482082433e197569cc5b 2021-03-30T01:20:19Z eng IEEE IEEE Access 2169-3536 2020-01-01 8 7313 7322 10.1109/ACCESS.2020.2964043 8950134 Automatic Building Extraction From High-Resolution Aerial Imagery via Fully Convolutional Encoder-Decoder Network With Non-Local Block Shengsheng Wang 0 https://orcid.org/0000-0002-8503-8061 Xiaowei Hou 1 https://orcid.org/0000-0002-3313-3972 Xin Zhao 2 https://orcid.org/0000-0003-2176-7537 College of Computer Science and Technology, Jilin University, Changchun, China College of Computer Science and Technology, Jilin University, Changchun, China College of Computer Science and Technology, Jilin University, Changchun, China Extracting buildings automatically from high-resolution aerial images is a significant and fundamental task for various practical applications, such as land-use statistics and urban planning. Recently, various methods based on deep learning, especially the fully convolution networks, achieve impressive scores in this challenging semantic segmentation task. However, the lack of global contextual information and the careless upsampling method limit the further improvement of the performance for building extraction task. To simultaneously address these problems, we propose a novel network named Efficient Non-local Residual U-shape Network(ENRU-Net), which is composed of a well designed U-shape encoder-decoder structure and an improved non-local block named asymmetric pyramid non-local block (APNB). The encoder-decoder structure is adopted to extract and restore the feature maps carefully, and APNB could capture global contextual information by utilizing self-attention mechanism. We evaluate the proposed ENRU-Net and compare it with other state-of-the-art models on two widely-used public aerial building imagery datasets: the Massachusetts Buildings Dataset and the WHU Aerial Imagery Dataset. The experiments show that the accuracy of ENRU-Net on these datasets has remarkable improvement against previous state-of-the-art semantic segmentation models, including FCN-8s, U-Net, SegNet and Deeplab v3. The subsequent analysis also indicates that our ENRU-Net has advantages in efficiency for building extraction from high-resolution aerial images. https://ieeexplore.ieee.org/document/8950134/ Deep learning semantic segmentation fully convolution network building extraction non-local method |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Shengsheng Wang Xiaowei Hou Xin Zhao |
spellingShingle |
Shengsheng Wang Xiaowei Hou Xin Zhao Automatic Building Extraction From High-Resolution Aerial Imagery via Fully Convolutional Encoder-Decoder Network With Non-Local Block IEEE Access Deep learning semantic segmentation fully convolution network building extraction non-local method |
author_facet |
Shengsheng Wang Xiaowei Hou Xin Zhao |
author_sort |
Shengsheng Wang |
title |
Automatic Building Extraction From High-Resolution Aerial Imagery via Fully Convolutional Encoder-Decoder Network With Non-Local Block |
title_short |
Automatic Building Extraction From High-Resolution Aerial Imagery via Fully Convolutional Encoder-Decoder Network With Non-Local Block |
title_full |
Automatic Building Extraction From High-Resolution Aerial Imagery via Fully Convolutional Encoder-Decoder Network With Non-Local Block |
title_fullStr |
Automatic Building Extraction From High-Resolution Aerial Imagery via Fully Convolutional Encoder-Decoder Network With Non-Local Block |
title_full_unstemmed |
Automatic Building Extraction From High-Resolution Aerial Imagery via Fully Convolutional Encoder-Decoder Network With Non-Local Block |
title_sort |
automatic building extraction from high-resolution aerial imagery via fully convolutional encoder-decoder network with non-local block |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2020-01-01 |
description |
Extracting buildings automatically from high-resolution aerial images is a significant and fundamental task for various practical applications, such as land-use statistics and urban planning. Recently, various methods based on deep learning, especially fully convolutional networks, have achieved impressive scores in this challenging semantic segmentation task. However, the lack of global contextual information and the careless upsampling method limit further improvement of performance on the building extraction task. To address both problems simultaneously, we propose a novel network named Efficient Non-local Residual U-shape Network (ENRU-Net), which is composed of a well-designed U-shape encoder-decoder structure and an improved non-local block named asymmetric pyramid non-local block (APNB). The encoder-decoder structure is adopted to extract and restore the feature maps carefully, and APNB captures global contextual information by utilizing a self-attention mechanism. We evaluate the proposed ENRU-Net and compare it with other state-of-the-art models on two widely used public aerial building imagery datasets: the Massachusetts Buildings Dataset and the WHU Aerial Imagery Dataset. The experiments show that ENRU-Net achieves a remarkable accuracy improvement on these datasets over previous state-of-the-art semantic segmentation models, including FCN-8s, U-Net, SegNet and Deeplab v3. The subsequent analysis also indicates that ENRU-Net has advantages in efficiency for building extraction from high-resolution aerial images. |
topic |
Deep learning semantic segmentation fully convolution network building extraction non-local method |
url |
https://ieeexplore.ieee.org/document/8950134/ |
work_keys_str_mv |
AT shengshengwang automaticbuildingextractionfromhighresolutionaerialimageryviafullyconvolutionalencoderdecodernetworkwithnonlocalblock AT xiaoweihou automaticbuildingextractionfromhighresolutionaerialimageryviafullyconvolutionalencoderdecodernetworkwithnonlocalblock AT xinzhao automaticbuildingextractionfromhighresolutionaerialimageryviafullyconvolutionalencoderdecodernetworkwithnonlocalblock |
_version_ |
1724187275688935424 |
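
The description in this record says that APNB captures global context through self-attention, but it does not spell out how the block works. The sketch below is only an illustration of the general idea, under loud assumptions: features are a flat list of position vectors rather than a 2-D feature map, the learned query/key/value projections are replaced by the identity, and the "asymmetric pyramid" part is modelled as average pooling the keys and values at a few bin sizes so that every position attends over a small set of pooled anchors. All function names and the bin sizes (1, 2, 4) are hypothetical and are not taken from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def average_pool(features, bins):
    """Reduce N feature vectors to `bins` vectors by averaging contiguous chunks.

    Assumes bins <= len(features) so that no chunk is empty.
    """
    n = len(features)
    dim = len(features[0])
    pooled = []
    for b in range(bins):
        chunk = features[(b * n) // bins:((b + 1) * n) // bins]
        pooled.append([sum(v[d] for v in chunk) / len(chunk) for d in range(dim)])
    return pooled


def pyramid_sample(features, bin_sizes=(1, 2, 4)):
    """Concatenate pooled anchors from several pooling resolutions (assumed sizes)."""
    anchors = []
    for bins in bin_sizes:
        anchors.extend(average_pool(features, bins))
    return anchors


def asymmetric_attention(features, bin_sizes=(1, 2, 4)):
    """Each of the N positions attends over S pooled anchors, with S < N.

    Queries come from every position; keys and values come from the pooled
    anchors, which is what makes the attention asymmetric in this sketch.
    """
    anchors = pyramid_sample(features, bin_sizes)
    dim = len(features[0])
    scale = math.sqrt(dim)
    output = []
    for query in features:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in anchors]
        weights = softmax(scores)
        context = [sum(w * a[d] for w, a in zip(weights, anchors)) for d in range(dim)]
        # Residual connection: add the aggregated context back to the input.
        output.append([x + c for x, c in zip(query, context)])
    return output


features = [[float(i), 1.0] for i in range(8)]
refined = asymmetric_attention(features)
```

With 8 input positions and bin sizes (1, 2, 4), each position compares itself against 7 anchors instead of all 8 positions; on a real feature map with thousands of positions the reduction in attention cost would be far larger, which is the kind of efficiency gain the description alludes to.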