Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation

It is challenging for semantic segmentation of buildings based on high-resolution remote sensing images, given high variability of appearance and complicated backgrounds of the buildings and their images. In this communication, we proposed an ensemble multi-scale residual deep learning method with t...


Bibliographic Details
Main Authors:	Chengyi Wang, Lianfa Li
Format:	Article
Language:	English
Published:	MDPI AG 2020-09-01
Series:	Remote Sensing
Subjects:	multiple scales; residual deep ensemble learning; regularizer; shape representation; semantic segmentation of buildings
Online Access:	https://www.mdpi.com/2072-4292/12/18/2932

id	doaj-b23cdb2f5e794f71a5c3772bd10d2aa1
record_format	Article
spelling	doaj-b23cdb2f5e794f71a5c3772bd10d2aa1 | 2020-11-25T02:31:00Z | eng | MDPI AG | Remote Sensing | 2072-4292 | 2020-09-01 | 12 | 2932 | 2932 | 10.3390/rs12182932 | Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation | Chengyi Wang 0 | Lianfa Li 1 | National Engineering Research Center for Geomatics, Aerospace Information Research Institute, Chinese Academy of Sciences, Datun Road, Beijing 100101, China | State Key Laboratory of Resources and Environmental Information Systems, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Datun Road, Beijing 100101, China | It is challenging for semantic segmentation of buildings based on high-resolution remote sensing images, given high variability of appearance and complicated backgrounds of the buildings and their images. In this communication, we proposed an ensemble multi-scale residual deep learning method with the regularizer of shape representation for semantic segmentation of buildings. Based on the U-Net architecture using residual connections and multi-scale ASPP (atrous spatial pyramid pooling) modules, our method introduced the regularizer of shape representation and ensemble learning of multi-scale models to enhance model training and reduce over-fitting. In our method, the shape representation was coded in an autoencoder that was used to encode and reconstruct the shape characteristics of the buildings. In prediction, we consider multi-scale trained models for different resolution inputs and side effects to obtain an optimal semantic segmentation. With the high-resolution image of Changshan, an island county in China, we used two-thirds of the study region image to train the model and the remaining one-third for the independent test. We obtained the accuracy of 0.98–0.99, mean intersection over union (MIoU) of 0.91–0.93 and Jaccard coefficient of 0.89–0.92 in validation. In the independent test, our method achieved state-of-the-art performance (MIoU: 0.83; Jaccard index: 0.81). By comparing with the existing representative methods on four different data sets, the proposed method consistently improved the learning process and generalization. The study shows important contributions of ensemble learning of multi-scale residual models and regularizer of shape representation to semantic segmentation of buildings. | https://www.mdpi.com/2072-4292/12/18/2932 | multiple scales | residual deep ensemble learning | regularizer | shape representation | semantic segmentation of buildings
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Chengyi Wang Lianfa Li
spellingShingle	Chengyi Wang Lianfa Li Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation Remote Sensing multiple scales residual deep ensemble learning regularizer shape representation semantic segmentation of buildings
author_facet	Chengyi Wang Lianfa Li
author_sort	Chengyi Wang
title	Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation
title_short	Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation
title_full	Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation
title_fullStr	Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation
title_full_unstemmed	Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation
title_sort	multi-scale residual deep network for semantic segmentation of buildings with regularizer of shape representation
publisher	MDPI AG
series	Remote Sensing
issn	2072-4292
publishDate	2020-09-01
description	It is challenging for semantic segmentation of buildings based on high-resolution remote sensing images, given high variability of appearance and complicated backgrounds of the buildings and their images. In this communication, we proposed an ensemble multi-scale residual deep learning method with the regularizer of shape representation for semantic segmentation of buildings. Based on the U-Net architecture using residual connections and multi-scale ASPP (atrous spatial pyramid pooling) modules, our method introduced the regularizer of shape representation and ensemble learning of multi-scale models to enhance model training and reduce over-fitting. In our method, the shape representation was coded in an autoencoder that was used to encode and reconstruct the shape characteristics of the buildings. In prediction, we consider multi-scale trained models for different resolution inputs and side effects to obtain an optimal semantic segmentation. With the high-resolution image of Changshan, an island county in China, we used two-thirds of the study region image to train the model and the remaining one-third for the independent test. We obtained the accuracy of 0.98–0.99, mean intersection over union (MIoU) of 0.91–0.93 and Jaccard coefficient of 0.89–0.92 in validation. In the independent test, our method achieved state-of-the-art performance (MIoU: 0.83; Jaccard index: 0.81). By comparing with the existing representative methods on four different data sets, the proposed method consistently improved the learning process and generalization. The study shows important contributions of ensemble learning of multi-scale residual models and regularizer of shape representation to semantic segmentation of buildings.
topic	multiple scales; residual deep ensemble learning; regularizer; shape representation; semantic segmentation of buildings
url	https://www.mdpi.com/2072-4292/12/18/2932
work_keys_str_mv	AT chengyiwang multiscaleresidualdeepnetworkforsemanticsegmentationofbuildingswithregularizerofshaperepresentation AT lianfali multiscaleresidualdeepnetworkforsemanticsegmentationofbuildingswithregularizerofshaperepresentation
_version_	1724826177439268864
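
The abstract in this record reports accuracy, mean intersection over union (MIoU) and a Jaccard coefficient for the building masks. The record does not give the authors' exact metric definitions, so the following is only a minimal sketch under stated assumptions: masks are flattened to sequences of 0 (background) and 1 (building), the Jaccard index is taken as the IoU of the building class, and MIoU is taken as the unweighted mean of the per-class IoU. The function names and the toy masks are hypothetical and are not taken from the paper.

```python
import math

def iou(pred, truth, cls):
    """Intersection over union of one class over two flat label sequences."""
    inter = sum(1 for p, t in zip(pred, truth) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, truth) if p == cls or t == cls)
    # A class absent from both masks has no defined IoU.
    return inter / union if union else math.nan

def mean_iou(pred, truth, classes=(0, 1)):
    """Unweighted mean of per-class IoU, skipping classes with undefined IoU."""
    scores = [s for s in (iou(pred, truth, c) for c in classes) if not math.isnan(s)]
    return sum(scores) / len(scores) if scores else math.nan

def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label equals the reference label."""
    return sum(1 for p, t in zip(pred, truth) if p == t) / len(truth)

# Toy 8-pixel masks (illustrative data only): 1 = building, 0 = background.
pred = [0, 0, 1, 1, 1, 0, 0, 1]
truth = [0, 0, 1, 1, 0, 0, 1, 1]

jaccard_building = iou(pred, truth, 1)   # assumed reading of "Jaccard index"
miou = mean_iou(pred, truth)             # assumed reading of "MIoU"
accuracy = pixel_accuracy(pred, truth)
```

Under these assumptions MIoU can differ from the building-class Jaccard index whenever the background IoU differs from the building IoU, which is consistent with the two values being reported separately in the abstract.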


Similar Items