Automatic Building Extraction From High-Resolution Aerial Imagery via Fully Convolutional Encoder-Decoder Network With Non-Local Block

Extracting buildings automatically from high-resolution aerial images is a fundamental task for practical applications such as land-use statistics and urban planning. Recently, various methods based on deep learning, especially fully convolutional networks, have achieved impress...


Bibliographic Details
Main Authors: Shengsheng Wang, Xiaowei Hou, Xin Zhao
Format: Article
Language: English
Published: IEEE 2020-01-01
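Series: IEEE Access
Subjects: Deep learning; semantic segmentation; fully convolution network; building extraction; non-local method
Online Access: https://ieeexplore.ieee.org/document/8950134/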
id doaj-a481e9b2e886482082433e197569cc5b
record_format Article
spelling doaj-a481e9b2e886482082433e197569cc5b; 2021-03-30T01:20:19Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2020-01-01; vol. 8, pp. 7313-7322; DOI 10.1109/ACCESS.2020.2964043; document 8950134
Shengsheng Wang (https://orcid.org/0000-0002-8503-8061), College of Computer Science and Technology, Jilin University, Changchun, China
Xiaowei Hou (https://orcid.org/0000-0002-3313-3972), College of Computer Science and Technology, Jilin University, Changchun, China
Xin Zhao (https://orcid.org/0000-0003-2176-7537), College of Computer Science and Technology, Jilin University, Changchun, China
collection DOAJ
language English
format Article
sources DOAJ
author Shengsheng Wang
Xiaowei Hou
Xin Zhao
title Automatic Building Extraction From High-Resolution Aerial Imagery via Fully Convolutional Encoder-Decoder Network With Non-Local Block
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description Extracting buildings automatically from high-resolution aerial images is a fundamental task for practical applications such as land-use statistics and urban planning. Recently, various methods based on deep learning, especially fully convolutional networks, have achieved impressive scores on this challenging semantic segmentation task. However, the lack of global contextual information and careless upsampling limit further performance improvement on the building extraction task. To address both problems simultaneously, we propose a novel network named Efficient Non-local Residual U-shape Network (ENRU-Net), which is composed of a well-designed U-shape encoder-decoder structure and an improved non-local block named asymmetric pyramid non-local block (APNB). The encoder-decoder structure is adopted to extract and restore the feature maps carefully, and APNB captures global contextual information by using a self-attention mechanism. We evaluate the proposed ENRU-Net and compare it with other state-of-the-art models on two widely used public aerial building imagery datasets: the Massachusetts Buildings Dataset and the WHU Aerial Imagery Dataset. The experiments show that ENRU-Net improves markedly in accuracy on these datasets over previous state-of-the-art semantic segmentation models, including FCN-8s, U-Net, SegNet and DeepLab v3. The subsequent analysis also indicates that ENRU-Net has efficiency advantages for building extraction from high-resolution aerial images.
topic Deep learning
semantic segmentation
fully convolution network
building extraction
non-local method
url https://ieeexplore.ieee.org/document/8950134/