Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation
Semantic segmentation of buildings from high-resolution remote sensing images is challenging, given the high variability in appearance and the complicated backgrounds of buildings. In this communication, we propose an ensemble multi-scale residual deep learning method with t...
Main Authors: | Chengyi Wang, Lianfa Li |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-09-01 |
Series: | Remote Sensing |
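Subjects: | multiple scales; residual deep ensemble learning; regularizer; shape representation; semantic segmentation of buildings |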
Online Access: | https://www.mdpi.com/2072-4292/12/18/2932 |
id |
doaj-b23cdb2f5e794f71a5c3772bd10d2aa1 |
---|---|
record_format |
Article |
spelling |
doaj-b23cdb2f5e794f71a5c3772bd10d2aa1; 2020-11-25T02:31:00Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2020-09-01; 12; 2932; 2932; 10.3390/rs12182932; Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation; Chengyi Wang [0]; Lianfa Li [1]; [0] National Engineering Research Center for Geomatics, Aerospace Information Research Institute, Chinese Academy of Sciences, Datun Road, Beijing 100101, China; [1] State Key Laboratory of Resources and Environmental Information Systems, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Datun Road, Beijing 100101, China; Semantic segmentation of buildings from high-resolution remote sensing images is challenging, given the high variability in appearance and the complicated backgrounds of buildings. In this communication, we propose an ensemble multi-scale residual deep learning method with the regularizer of shape representation for semantic segmentation of buildings. Based on the U-Net architecture with residual connections and multi-scale ASPP (atrous spatial pyramid pooling) modules, our method introduces the regularizer of shape representation and ensemble learning of multi-scale models to enhance model training and reduce over-fitting. In our method, the shape representation is coded in an autoencoder that is used to encode and reconstruct the shape characteristics of the buildings. In prediction, we consider multi-scale trained models for different resolution inputs and side effects to obtain an optimal semantic segmentation. Using a high-resolution image of Changshan, an island county in China, we trained the model on two-thirds of the study region image and used the remaining one-third for an independent test. In validation, we obtained an accuracy of 0.98–0.99, a mean intersection over union (MIoU) of 0.91–0.93 and a Jaccard coefficient of 0.89–0.92. In the independent test, our method achieved state-of-the-art performance (MIoU: 0.83; Jaccard index: 0.81). Compared with existing representative methods on four different data sets, the proposed method consistently improved the learning process and generalization. The study shows the important contributions of ensemble learning of multi-scale residual models and of the regularizer of shape representation to semantic segmentation of buildings.; https://www.mdpi.com/2072-4292/12/18/2932; multiple scales; residual deep ensemble learning; regularizer; shape representation; semantic segmentation of buildings |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Chengyi Wang; Lianfa Li |
spellingShingle |
Chengyi Wang Lianfa Li Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation Remote Sensing multiple scales residual deep ensemble learning regularizer shape representation semantic segmentation of buildings |
author_facet |
Chengyi Wang; Lianfa Li |
author_sort |
Chengyi Wang |
title |
Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation |
title_short |
Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation |
title_full |
Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation |
title_fullStr |
Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation |
title_full_unstemmed |
Multi-Scale Residual Deep Network for Semantic Segmentation of Buildings with Regularizer of Shape Representation |
title_sort |
multi-scale residual deep network for semantic segmentation of buildings with regularizer of shape representation |
publisher |
MDPI AG |
series |
Remote Sensing |
issn |
2072-4292 |
publishDate |
2020-09-01 |
description |
Semantic segmentation of buildings from high-resolution remote sensing images is challenging, given the high variability in appearance and the complicated backgrounds of buildings. In this communication, we propose an ensemble multi-scale residual deep learning method with the regularizer of shape representation for semantic segmentation of buildings. Based on the U-Net architecture with residual connections and multi-scale ASPP (atrous spatial pyramid pooling) modules, our method introduces the regularizer of shape representation and ensemble learning of multi-scale models to enhance model training and reduce over-fitting. In our method, the shape representation is coded in an autoencoder that is used to encode and reconstruct the shape characteristics of the buildings. In prediction, we consider multi-scale trained models for different resolution inputs and side effects to obtain an optimal semantic segmentation. Using a high-resolution image of Changshan, an island county in China, we trained the model on two-thirds of the study region image and used the remaining one-third for an independent test. In validation, we obtained an accuracy of 0.98–0.99, a mean intersection over union (MIoU) of 0.91–0.93 and a Jaccard coefficient of 0.89–0.92. In the independent test, our method achieved state-of-the-art performance (MIoU: 0.83; Jaccard index: 0.81). Compared with existing representative methods on four different data sets, the proposed method consistently improved the learning process and generalization. The study shows the important contributions of ensemble learning of multi-scale residual models and of the regularizer of shape representation to semantic segmentation of buildings. |
topic |
multiple scales; residual deep ensemble learning; regularizer; shape representation; semantic segmentation of buildings |
url |
https://www.mdpi.com/2072-4292/12/18/2932 |
work_keys_str_mv |
AT chengyiwang multiscaleresidualdeepnetworkforsemanticsegmentationofbuildingswithregularizerofshaperepresentation AT lianfali multiscaleresidualdeepnetworkforsemanticsegmentationofbuildingswithregularizerofshaperepresentation |
_version_ |
1724826177439268864 |
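
The abstract in this record reports pixel accuracy, mean intersection over union (MIoU) and the Jaccard index for building masks. The following is a minimal sketch of how such scores are commonly computed for a binary building/background mask. It is not taken from the article: the function names, the toy masks, and the reading of "Jaccard index" as the IoU of the building class (label 1) and "MIoU" as the unweighted mean of the per-class IoUs are all assumptions made for illustration.

```python
from itertools import chain


def flatten(mask):
    """Flatten a 2-D mask (a list of rows) into a flat list of labels."""
    return list(chain.from_iterable(mask))


def class_iou(pred, truth, cls):
    """Intersection over union for one class; None if the class never occurs."""
    p, t = flatten(pred), flatten(truth)
    if len(p) != len(t):
        raise ValueError("prediction and ground truth must have the same size")
    inter = sum(1 for a, b in zip(p, t) if a == cls and b == cls)
    union = sum(1 for a, b in zip(p, t) if a == cls or b == cls)
    return inter / union if union else None


def mean_iou(pred, truth, classes=(0, 1)):
    """Unweighted mean of the per-class IoUs, skipping classes that never occur."""
    scores = [class_iou(pred, truth, c) for c in classes]
    scores = [s for s in scores if s is not None]
    if not scores:
        raise ValueError("none of the requested classes occur in the masks")
    return sum(scores) / len(scores)


def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label equals the ground-truth label."""
    p, t = flatten(pred), flatten(truth)
    if len(p) != len(t):
        raise ValueError("prediction and ground truth must have the same size")
    return sum(1 for a, b in zip(p, t) if a == b) / len(p)


# Hypothetical 3x3 masks: 1 = building, 0 = background.
pred = [[0, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]
truth = [[0, 1, 1],
         [0, 1, 1],
         [0, 0, 0]]

jaccard_building = class_iou(pred, truth, cls=1)  # IoU of the building class
miou = mean_iou(pred, truth)                      # mean over background and building
accuracy = pixel_accuracy(pred, truth)
print(jaccard_building, miou, accuracy)
```

On these toy masks the prediction misses one building pixel, so the building IoU is 3/4 while pixel accuracy is 8/9. Accuracy is dominated by the background class, which is one reason IoU-based scores are usually reported alongside it for building segmentation.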