A Hybrid Attention-Aware Fusion Network (HAFNet) for Building Extraction from High-Resolution Imagery and LiDAR Data

Automated extraction of buildings from earth observation (EO) data has long been a fundamental but challenging research topic. Combining data from different modalities (e.g., high-resolution imagery (HRI) and light detection and ranging (LiDAR) data) has shown great potential in building extraction....

Bibliographic Details
Main Authors:	Peng Zhang, Peijun Du, Cong Lin, Xin Wang, Erzhu Li, Zhaohui Xue, Xuyu Bai
Format:	Article
Language:	English
Published:	MDPI AG 2020-11-01
Series:	Remote Sensing
Subjects:	building extraction; high-resolution imagery (HRI); light detection and ranging (LiDAR); multimodal data fusion; deep learning; attention mechanism
Online Access:	https://www.mdpi.com/2072-4292/12/22/3764

id	doaj-88cd67ac09e54848a2f38c440ce3b732
record_format	Article
spelling	doaj-88cd67ac09e54848a2f38c440ce3b732; 2020-11-25T04:02:44Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2020-11-01; 12; 3764; 3764; 10.3390/rs12223764; A Hybrid Attention-Aware Fusion Network (HAFNet) for Building Extraction from High-Resolution Imagery and LiDAR Data; Peng Zhang (0); Peijun Du (1); Cong Lin (2); Xin Wang (3); Erzhu Li (4); Zhaohui Xue (5); Xuyu Bai (6)
	(0), (1), (2), (3), (6): Jiangsu Provincial Key Laboratory of Geographic Information Science and Technology, Key Laboratory for Land Satellite Remote Sensing Applications of Ministry of Natural Resources, School of Geography and Ocean Science, Nanjing University, Nanjing 210023, China
	(4): School of Geography, Geomatics and Planning, Jiangsu Normal University, Xuzhou 221116, China
	(5): School of Earth Sciences and Engineering, Hohai University, Nanjing 211100, China
	https://www.mdpi.com/2072-4292/12/22/3764; building extraction; high-resolution imagery (HRI); light detection and ranging (LiDAR); multimodal data fusion; deep learning; attention mechanism
collection	DOAJ
language	English
format	Article
sources	DOAJ
title	A Hybrid Attention-Aware Fusion Network (HAFNet) for Building Extraction from High-Resolution Imagery and LiDAR Data
publisher	MDPI AG
series	Remote Sensing
issn	2072-4292
publishDate	2020-11-01
description	Automated extraction of buildings from earth observation (EO) data has long been a fundamental but challenging research topic. Combining data from different modalities (e.g., high-resolution imagery (HRI) and light detection and ranging (LiDAR) data) has shown great potential in building extraction. Recent studies have examined the role that deep learning (DL) could play in both multimodal data fusion and urban object extraction. However, DL-based multimodal fusion networks may encounter the following limitations: (1) the individual modal and cross-modal features, which we consider both useful and important for final prediction, cannot be sufficiently learned and utilized and (2) the multimodal features are fused by a simple summation or concatenation, which appears ambiguous in selecting cross-modal complementary information. In this paper, we address these two limitations by proposing a hybrid attention-aware fusion network (HAFNet) for building extraction. It consists of RGB-specific, digital surface model (DSM)-specific, and cross-modal streams to sufficiently learn and utilize both individual modal and cross-modal features. Furthermore, an attention-aware multimodal fusion block (Att-MFBlock) was introduced to overcome the fusion problem by adaptively selecting and combining complementary features from each modality. Extensive experiments conducted on two publicly available datasets demonstrated the effectiveness of the proposed HAFNet for building extraction.
topic	building extraction; high-resolution imagery (HRI); light detection and ranging (LiDAR); multimodal data fusion; deep learning; attention mechanism
url	https://www.mdpi.com/2072-4292/12/22/3764
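
The description field contrasts fusion by simple summation with a block that adaptively weights features from each modality. The sketch below illustrates that contrast only; it is not the authors' Att-MFBlock, whose internals this record does not give. It assumes each stream (RGB, DSM, cross-modal) yields a feature map with the same number of channels, and it assumes, purely for illustration, that the per-channel weight is a softmax over the streams' average activations. A trained network would learn that weighting. All function names and feature values are hypothetical.

```python
import math


def global_average(channel):
    """Mean activation of one channel (a flat list of floats)."""
    return sum(channel) / len(channel)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sum_fusion(modalities):
    """Baseline: element-wise sum of the modality feature maps."""
    n_channels = len(modalities[0])
    return [
        [sum(vals) for vals in zip(*(m[c] for m in modalities))]
        for c in range(n_channels)
    ]


def attention_fusion(modalities):
    """Weight each modality per channel, then sum the weighted maps.

    Assumption: the weight is a softmax over the modalities' pooled
    activations. A trained block would learn this mapping instead.
    """
    n_channels = len(modalities[0])
    fused, weights = [], []
    for c in range(n_channels):
        # One weight per modality for this channel; weights sum to 1.
        w = softmax([global_average(m[c]) for m in modalities])
        weights.append(w)
        fused.append([
            sum(wi * v for wi, v in zip(w, vals))
            for vals in zip(*(m[c] for m in modalities))
        ])
    return fused, weights


# Toy features: 2 channels x 4 pixels per stream (values are made up).
rgb = [[0.9, 0.8, 0.7, 0.6], [0.1, 0.1, 0.1, 0.1]]
dsm = [[0.1, 0.1, 0.1, 0.1], [0.9, 0.8, 0.7, 0.6]]
cross = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]]

fused, weights = attention_fusion([rgb, dsm, cross])
baseline = sum_fusion([rgb, dsm, cross])
```

In this toy input the RGB stream dominates channel 0 and the DSM stream dominates channel 1, so the weighted fusion leans on a different modality per channel, whereas the summation baseline treats all three streams equally.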
