Cross-View Image Translation Based on Local and Global Information Guidance

The cross-view image translation task aims to generate scene images from arbitrary views. However, because the shapes and contents of different views differ greatly, the quality of the generated images is degraded: small objects, such as vehicles, lose their shapes and details and become structurally inconsistent with the semantic map used to guide the generation process. To solve this problem, we propose a novel generative adversarial network based on a local and global information processing module (LAGGAN) to recover image details and structures. The network further combines the input viewpoint image and the target semantic segmentation map to guide the generation of the target image from another viewpoint. The proposed LAGGAN comprises a two-stage generator and a parameter-sharing discriminator, and uses a new local and global information processing (LAG) module to generate high-quality images across views. Moreover, we integrate dilated convolutions into the discriminator to capture global context, which enhances its discriminative ability and further adjusts the LAG module. As a result, most semantic information is preserved and the details of the target-viewpoint images are rendered more sharply. Quantitative and qualitative evaluations on the CVUSA and Dayton datasets show that LAGGAN produces satisfactory perceptual results and is comparable to state-of-the-art methods on the cross-view image translation task.
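The abstract describes a generator built around a module that fuses local detail with global context, and a discriminator that uses dilated convolutions to widen its receptive field. A minimal PyTorch sketch of such a local/global block is given below; the class name, branch layout, and layer sizes are illustrative assumptions, not the authors' released implementation.

# Hypothetical sketch of a "local and global" feature block; names and
# layer choices are assumptions for illustration, not the paper's code.
import torch
import torch.nn as nn

class LAGBlock(nn.Module):
    """Fuses a local branch (standard 3x3 conv) with a global branch
    (dilated 3x3 conv) to mix fine detail and wider scene context."""
    def __init__(self, channels: int):
        super().__init__()
        # Local branch: small receptive field, preserves object detail.
        self.local = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Global branch: dilated convolution enlarges the receptive field
        # without downsampling, capturing scene-level context.
        self.global_ctx = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=4, dilation=4),
            nn.ReLU(inplace=True),
        )
        # 1x1 convolution fuses the two branches back to the input width.
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        merged = torch.cat([self.local(x), self.global_ctx(x)], dim=1)
        return x + self.fuse(merged)  # residual connection keeps gradients stable

if __name__ == "__main__":
    block = LAGBlock(64)
    features = torch.randn(1, 64, 32, 32)
    print(block(features).shape)  # torch.Size([1, 64, 32, 32])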


Bibliographic Details
Main Authors: Yan Shen, Meng Luo, Yun Chen, Xiaotao Shao, Zhongli Wang, Xiaoli Hao, Ya-Li Hou
Format: Article
Language: English
Published: IEEE, 2021-01-01
Series: IEEE Access
Subjects: Aerial images; cross-view image translation; generative adversarial networks (GANs); ground-level images; local and global information processing module
Online Access:https://ieeexplore.ieee.org/document/9328105/
DOI: 10.1109/ACCESS.2021.3052241
ISSN: 2169-3536
Published in: IEEE Access, vol. 9, pp. 12955-12967, 2021
Author Affiliations: Yan Shen, Meng Luo, Xiaotao Shao, Zhongli Wang, Xiaoli Hao, Ya-Li Hou (School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China); Yun Chen (Shanghai Institute of Spaceflight Control Technology, Shanghai, China)