Cross-View Image Translation Based on Local and Global Information Guidance
The cross-view image translation task aims to generate scene images from arbitrary views. However, because the shapes and contents of different views vary greatly, the quality of the generated images degrades. Small objects, such as vehicles, lose shape and detail, which makes them structurally inconsistent with the semantic map used to guide the generation process.
Main Authors: | Yan Shen, Meng Luo, Yun Chen, Xiaotao Shao, Zhongli Wang, Xiaoli Hao, Ya-Li Hou |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2021-01-01 |
Series: | IEEE Access |
Subjects: | Aerial images; cross-view image translation; generative adversarial networks (GANs); ground-level images; local and global information processing module |
Online Access: | https://ieeexplore.ieee.org/document/9328105/ |
id |
doaj-e445fa53129c4076b0c93832a2c526df |
---|---|
record_format |
Article |
spelling |
doaj-e445fa53129c4076b0c93832a2c526df 2021-04-05T17:36:28Z eng IEEE. IEEE Access, ISSN 2169-3536, 2021-01-01, vol. 9, pp. 12955-12967. DOI: 10.1109/ACCESS.2021.3052241. IEEE document 9328105.
Cross-View Image Translation Based on Local and Global Information Guidance
Authors:
- Yan Shen (https://orcid.org/0000-0001-9287-1206), School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
- Meng Luo (https://orcid.org/0000-0003-1294-2890), School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
- Yun Chen, Shanghai Institute of Spaceflight Control Technology, Shanghai, China
- Xiaotao Shao (https://orcid.org/0000-0003-0758-518X), School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
- Zhongli Wang (https://orcid.org/0000-0002-3236-8219), School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
- Xiaoli Hao, School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
- Ya-Li Hou (https://orcid.org/0000-0002-0518-5935), School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
Abstract: The cross-view image translation task aims to generate scene images from arbitrary views. However, because the shapes and contents of different views vary greatly, the quality of the generated images degrades. Small objects, such as vehicles, lose shape and detail, which makes them structurally inconsistent with the semantic map used to guide the generation process. To solve this problem, we propose a novel generative adversarial network based on a local and global information processing module (LAGGAN) to recover image details and structures. The network further combines the input-view image with the target semantic segmentation map to guide the generation of the target image from another viewpoint. The proposed LAGGAN comprises a two-stage generator and a parameter-sharing discriminator. LAGGAN uses a new local and global information processing (LAG) module to generate high-quality images from various views. Moreover, we integrate dilated convolutions into the discriminator to capture global context, which enhances its discriminative ability and further adjusts the LAG module. As a result, most semantic information is preserved, and the details of the target-view images are translated more sharply. Quantitative and qualitative evaluations on the CVUSA and Dayton datasets show that LAGGAN produces satisfactory perceptual results and is comparable to state-of-the-art methods on cross-view image translation.
Online access: https://ieeexplore.ieee.org/document/9328105/
Subjects: Aerial images; cross-view image translation; generative adversarial networks (GANs); ground-level images; local and global information processing module |
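The abstract's claim that dilated convolutions let the discriminator capture global context rests on standard convolution arithmetic: with stride 1, each layer widens the receptive field by (kernel − 1) × dilation. The sketch below illustrates that arithmetic only; the specific kernel size and dilation rates (3×3 with rates 1, 2, 4) are illustrative assumptions, not LAGGAN's published configuration.

```python
# Receptive-field growth for a stack of stride-1 convolution layers.
# Each layer adds (kernel - 1) * dilation pixels of context.
# Generic conv arithmetic; the rates below are assumed for illustration.

def receptive_field(kernel: int, dilations: list[int]) -> int:
    """Receptive field of stacked stride-1 convs with given dilation rates."""
    rf = 1
    for d in dilations:
        rf += (kernel - 1) * d
    return rf

# Plain 3x3 stack: each layer adds only 2 pixels of context.
print(receptive_field(3, [1, 1, 1]))  # -> 7
# Dilated 3x3 stack (rates 1, 2, 4): same depth and parameter count,
# but a much wider view, which is what lets a discriminator judge
# global structure rather than only local texture.
print(receptive_field(3, [1, 2, 4]))  # -> 15
```

Exponentially increasing rates (1, 2, 4, 8, …) are the common choice because the receptive field then grows exponentially with depth at no extra parameter cost.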
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Yan Shen; Meng Luo; Yun Chen; Xiaotao Shao; Zhongli Wang; Xiaoli Hao; Ya-Li Hou |
spellingShingle |
Yan Shen; Meng Luo; Yun Chen; Xiaotao Shao; Zhongli Wang; Xiaoli Hao; Ya-Li Hou. Cross-View Image Translation Based on Local and Global Information Guidance. IEEE Access. Aerial images; cross-view image translation; generative adversarial networks (GANs); ground-level images; local and global information processing module |
author_facet |
Yan Shen; Meng Luo; Yun Chen; Xiaotao Shao; Zhongli Wang; Xiaoli Hao; Ya-Li Hou |
author_sort |
Yan Shen |
title |
Cross-View Image Translation Based on Local and Global Information Guidance |
title_short |
Cross-View Image Translation Based on Local and Global Information Guidance |
title_full |
Cross-View Image Translation Based on Local and Global Information Guidance |
title_fullStr |
Cross-View Image Translation Based on Local and Global Information Guidance |
title_full_unstemmed |
Cross-View Image Translation Based on Local and Global Information Guidance |
title_sort |
cross-view image translation based on local and global information guidance |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2021-01-01 |
description |
The cross-view image translation task aims to generate scene images from arbitrary views. However, because the shapes and contents of different views vary greatly, the quality of the generated images degrades. Small objects, such as vehicles, lose shape and detail, which makes them structurally inconsistent with the semantic map used to guide the generation process. To solve this problem, we propose a novel generative adversarial network based on a local and global information processing module (LAGGAN) to recover image details and structures. The network further combines the input-view image with the target semantic segmentation map to guide the generation of the target image from another viewpoint. The proposed LAGGAN comprises a two-stage generator and a parameter-sharing discriminator. LAGGAN uses a new local and global information processing (LAG) module to generate high-quality images from various views. Moreover, we integrate dilated convolutions into the discriminator to capture global context, which enhances its discriminative ability and further adjusts the LAG module. As a result, most semantic information is preserved, and the details of the target-view images are translated more sharply. Quantitative and qualitative evaluations on the CVUSA and Dayton datasets show that LAGGAN produces satisfactory perceptual results and is comparable to state-of-the-art methods on cross-view image translation. |
topic |
Aerial images; cross-view image translation; generative adversarial networks (GANs); ground-level images; local and global information processing module |
url |
https://ieeexplore.ieee.org/document/9328105/ |
work_keys_str_mv |
AT yanshen crossviewimagetranslationbasedonlocalandglobalinformationguidance AT mengluo crossviewimagetranslationbasedonlocalandglobalinformationguidance AT yunchen crossviewimagetranslationbasedonlocalandglobalinformationguidance AT xiaotaoshao crossviewimagetranslationbasedonlocalandglobalinformationguidance AT zhongliwang crossviewimagetranslationbasedonlocalandglobalinformationguidance AT xiaolihao crossviewimagetranslationbasedonlocalandglobalinformationguidance AT yalihou crossviewimagetranslationbasedonlocalandglobalinformationguidance |
_version_ |
1721539357954801664 |