Structural inference embedded adversarial networks for scene parsing.
Explicit structural inference is one key to improving the accuracy of scene parsing. Meanwhile, adversarial training is able to reinforce spatial contiguity in output segmentations. To take advantage of both structural learning and adversarial training simultaneously, we propose a novel deep learning network architecture called Structural Inference Embedded Adversarial Networks (SIEANs) for pixel-wise scene labeling.
Main Authors: | ZeYu Wang, YanXia Wu, ShuHui Bu, PengCheng Han, GuoYin Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2018-01-01 |
Series: | PLoS ONE |
Online Access: | http://europepmc.org/articles/PMC5896926?pdf=render |
id |
doaj-479e892ee9ec444b9b3a8e4323fae34b |
---|---|
record_format |
Article |
spelling |
doaj-479e892ee9ec444b9b3a8e4323fae34b | 2020-11-24T22:11:46Z | eng | Public Library of Science (PLoS) | PLoS ONE 13(4): e0195114, ISSN 1932-6203, 2018-01-01 | doi:10.1371/journal.pone.0195114 | Structural inference embedded adversarial networks for scene parsing. | ZeYu Wang, YanXia Wu, ShuHui Bu, PengCheng Han, GuoYin Zhang | http://europepmc.org/articles/PMC5896926?pdf=render |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
ZeYu Wang, YanXia Wu, ShuHui Bu, PengCheng Han, GuoYin Zhang |
spellingShingle |
ZeYu Wang, YanXia Wu, ShuHui Bu, PengCheng Han, GuoYin Zhang. Structural inference embedded adversarial networks for scene parsing. PLoS ONE |
author_facet |
ZeYu Wang, YanXia Wu, ShuHui Bu, PengCheng Han, GuoYin Zhang |
author_sort |
ZeYu Wang |
title |
Structural inference embedded adversarial networks for scene parsing. |
title_short |
Structural inference embedded adversarial networks for scene parsing. |
title_full |
Structural inference embedded adversarial networks for scene parsing. |
title_fullStr |
Structural inference embedded adversarial networks for scene parsing. |
title_full_unstemmed |
Structural inference embedded adversarial networks for scene parsing. |
title_sort |
structural inference embedded adversarial networks for scene parsing. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS ONE |
issn |
1932-6203 |
publishDate |
2018-01-01 |
description |
Explicit structural inference is one key to improving the accuracy of scene parsing. Meanwhile, adversarial training is able to reinforce spatial contiguity in output segmentations. To take advantage of both structural learning and adversarial training simultaneously, we propose a novel deep learning network architecture called Structural Inference Embedded Adversarial Networks (SIEANs) for pixel-wise scene labeling. The generator of our SIEANs, a newly designed scene parsing network, makes full use of convolutional neural networks and long short-term memory networks to learn the global contextual information of objects in four different directions from RGB(-D) images, which describes the (three-dimensional) spatial distributions of objects in a more comprehensive and accurate way. To further improve performance, we explore adversarial training to optimize the generator along with a discriminator, which can not only detect and correct higher-order inconsistencies between the predicted segmentations and the corresponding ground truths, but also exploit the full capacity of the generator by fine-tuning its parameters so as to obtain higher consistencies. The experimental results demonstrate that our proposed SIEANs achieve better performance on the PASCAL VOC 2012, SIFT FLOW, PASCAL Person-Part, Cityscapes, Stanford Background, NYUDv2, and SUN-RGBD datasets compared to most state-of-the-art methods. |
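The description above outlines an adversarial training pattern for segmentation: a generator producing per-pixel labels is optimized jointly with a discriminator that judges predicted label maps against ground truth. The sketch below is a minimal illustration of that general pattern only, assuming PyTorch; `SegGenerator`, `SegDiscriminator`, `train_step`, and the weight `lam` are hypothetical placeholders and do not reproduce the SIEAN architecture or the authors' implementation.

```python
# Minimal sketch of adversarial training for semantic segmentation (assumption:
# PyTorch). The toy generator/discriminator stand in for the paper's networks.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SegGenerator(nn.Module):
    """Toy fully-convolutional generator: image -> per-pixel class logits."""
    def __init__(self, in_ch=3, n_classes=21):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, n_classes, 1),
        )
    def forward(self, x):
        return self.net(x)  # (B, n_classes, H, W)

class SegDiscriminator(nn.Module):
    """Toy discriminator: (image, label map) -> real/fake score."""
    def __init__(self, in_ch=3, n_classes=21):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch + n_classes, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, stride=2, padding=1),
            nn.AdaptiveAvgPool2d(1),
        )
    def forward(self, image, label_map):
        return self.net(torch.cat([image, label_map], dim=1)).flatten(1)  # (B, 1)

def train_step(gen, disc, opt_g, opt_d, image, target, n_classes=21, lam=0.1):
    """One step: per-pixel cross-entropy plus a GAN term that penalizes
    higher-order inconsistencies between prediction and ground truth."""
    one_hot = F.one_hot(target, n_classes).permute(0, 3, 1, 2).float()
    logits = gen(image)
    probs = torch.softmax(logits, dim=1)

    # Discriminator: ground-truth maps are "real", predicted maps are "fake".
    d_real = disc(image, one_hot)
    d_fake = disc(image, probs.detach())
    loss_d = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator: segmentation loss plus a term for fooling the discriminator.
    d_fake_for_g = disc(image, probs)
    loss_g = (F.cross_entropy(logits, target)
              + lam * F.binary_cross_entropy_with_logits(
                    d_fake_for_g, torch.ones_like(d_fake_for_g)))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_g.item(), loss_d.item()
```

The weight `lam` balancing the adversarial term against the per-pixel loss is an assumed hyperparameter of this sketch; in the paper's design the generator itself additionally combines CNN features with long short-term memory networks scanning in four directions, which is not modeled here.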
url |
http://europepmc.org/articles/PMC5896926?pdf=render |
work_keys_str_mv |
AT zeyuwang structuralinferenceembeddedadversarialnetworksforsceneparsing AT yanxiawu structuralinferenceembeddedadversarialnetworksforsceneparsing AT shuhuibu structuralinferenceembeddedadversarialnetworksforsceneparsing AT pengchenghan structuralinferenceembeddedadversarialnetworksforsceneparsing AT guoyinzhang structuralinferenceembeddedadversarialnetworksforsceneparsing |
_version_ |
1725804276282818560 |