Structural inference embedded adversarial networks for scene parsing.

Explicit structural inference is a key factor in improving the accuracy of scene parsing. Meanwhile, adversarial training reinforces spatial contiguity in output segmentations. To take advantage of both structural learning and adversarial training simultaneously, we propose a novel deep learning network architecture called Structural Inference Embedded Adversarial Networks (SIEANs) for pixel-wise scene labeling.

Full description

Bibliographic Details
Main Authors: ZeYu Wang, YanXia Wu, ShuHui Bu, PengCheng Han, GuoYin Zhang
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2018-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC5896926?pdf=render
id doaj-479e892ee9ec444b9b3a8e4323fae34b
record_format Article
spelling doaj-479e892ee9ec444b9b3a8e4323fae34b 2020-11-24T22:11:46Z
doi 10.1371/journal.pone.0195114
collection DOAJ
language English
format Article
sources DOAJ
author ZeYu Wang
YanXia Wu
ShuHui Bu
PengCheng Han
GuoYin Zhang
spellingShingle ZeYu Wang
YanXia Wu
ShuHui Bu
PengCheng Han
GuoYin Zhang
Structural inference embedded adversarial networks for scene parsing.
PLoS ONE
author_facet ZeYu Wang
YanXia Wu
ShuHui Bu
PengCheng Han
GuoYin Zhang
author_sort ZeYu Wang
title Structural inference embedded adversarial networks for scene parsing.
title_short Structural inference embedded adversarial networks for scene parsing.
title_full Structural inference embedded adversarial networks for scene parsing.
title_fullStr Structural inference embedded adversarial networks for scene parsing.
title_full_unstemmed Structural inference embedded adversarial networks for scene parsing.
title_sort structural inference embedded adversarial networks for scene parsing.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2018-01-01
description Explicit structural inference is a key factor in improving the accuracy of scene parsing. Meanwhile, adversarial training reinforces spatial contiguity in output segmentations. To take advantage of both structural learning and adversarial training simultaneously, we propose a novel deep learning network architecture called Structural Inference Embedded Adversarial Networks (SIEANs) for pixel-wise scene labeling. The generator of our SIEANs, a newly designed scene parsing network, makes full use of convolutional neural networks and long short-term memory networks to learn the global contextual information of objects in four different directions from RGB(-D) images, which describes the (three-dimensional) spatial distributions of objects in a more comprehensive and accurate way. To further improve performance, we use adversarial training to optimize the generator along with a discriminator, which can not only detect and correct higher-order inconsistencies between the predicted segmentations and the corresponding ground truths, but also exploit the full capacity of the generator by fine-tuning its parameters to obtain higher consistency. The experimental results demonstrate that our proposed SIEANs achieve better performance on the PASCAL VOC 2012, SIFT Flow, PASCAL-Person-Part, Cityscapes, Stanford Background, NYUDv2, and SUN-RGBD datasets compared to most state-of-the-art methods.
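The description combines a per-pixel segmentation objective with an adversarial term from a discriminator. As a hedged illustration only (not the authors' implementation; the function names, the weighting parameter `lam`, and the toy data here are hypothetical), a minimal numpy sketch of such a hybrid loss might look like:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the class axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def pixel_cross_entropy(logits, labels):
    # logits: (H, W, C) class scores; labels: (H, W) ground-truth class indices.
    # Mean negative log-probability of the correct class at each pixel.
    p = softmax(logits)
    h, w = labels.shape
    correct = p[np.arange(h)[:, None], np.arange(w)[None, :], labels]
    return -np.mean(np.log(correct + 1e-12))

def hybrid_loss(logits, labels, d_score, lam=0.1):
    # d_score: discriminator's estimated probability that the predicted
    # segmentation is a ground-truth map. The generator is rewarded for
    # fooling the discriminator (d_score -> 1), alongside per-pixel accuracy.
    ce = pixel_cross_entropy(logits, labels)
    adv = -np.log(d_score + 1e-12)
    return ce + lam * adv

# Toy usage: confident, correct logits plus a convinced discriminator
# yield a much smaller loss than uniform logits with a skeptical one.
labels = np.array([[0, 1], [2, 0]])
good_logits = np.zeros((2, 2, 3))
for i in range(2):
    for j in range(2):
        good_logits[i, j, labels[i, j]] = 10.0
loss_good = hybrid_loss(good_logits, labels, d_score=0.9)
loss_bad = hybrid_loss(np.zeros((2, 2, 3)), labels, d_score=0.1)
```

The weighting `lam` trades off pixel-level fidelity against the higher-order consistency signal from the discriminator; in practice both networks would be trained alternately, which this static sketch does not show.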
url http://europepmc.org/articles/PMC5896926?pdf=render
work_keys_str_mv AT zeyuwang structuralinferenceembeddedadversarialnetworksforsceneparsing
AT yanxiawu structuralinferenceembeddedadversarialnetworksforsceneparsing
AT shuhuibu structuralinferenceembeddedadversarialnetworksforsceneparsing
AT pengchenghan structuralinferenceembeddedadversarialnetworksforsceneparsing
AT guoyinzhang structuralinferenceembeddedadversarialnetworksforsceneparsing
_version_ 1725804276282818560