Supervised Video-to-Video Synthesis for Single Human Pose Transfer
In this paper, we focus on human pose transfer between different videos, i.e., transferring the dance pose of a person in a given video to a target person in another video. Our method tackles this challenging scenario in three stages. Firstly, we extract the frames and pose masks from...
Main Authors: | Hongyu Wang, Mengxing Huang, Di Wu, Yuchun Li, Weichao Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2021-01-01 |
Series: | IEEE Access |
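Subjects: | Generative adversarial network (GAN); image-to-image translation; video-to-video synthesis; pose-guided person image generation |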
Online Access: | https://ieeexplore.ieee.org/document/9333577/ |
id |
doaj-d13e6a8e35ce48c4bec04e88bed52437 |
---|---|
record_format |
Article |
spelling |
doaj-d13e6a8e35ce48c4bec04e88bed52437; 2021-03-30T15:24:52Z; eng; IEEE; IEEE Access; 2169-3536; 2021-01-01; 9; 17544; 17556; 10.1109/ACCESS.2021.3053617; 9333577; Supervised Video-to-Video Synthesis for Single Human Pose Transfer; Hongyu Wang (https://orcid.org/0000-0003-0224-3156), Mengxing Huang (https://orcid.org/0000-0002-5709-703X), Di Wu, Yuchun Li (https://orcid.org/0000-0003-2723-220X), Weichao Zhang; all five authors: State Key Laboratory of Marine Resource Utilization in South China Sea, Hainan University, Haikou, China; In this paper, we focus on human pose transfer between different videos, i.e., transferring the dance pose of a person in a given video to a target person in another video. Our method tackles this challenging scenario in three stages. Firstly, we extract the frames and pose masks from the source video and target video. Secondly, we use our model to synthesize the frames of the target person with the given dance pose. Thirdly, we refine the generated frames to improve the quality of the outputs. Our model is built on three stages: 1) human pose extraction and normalization; 2) a GAN based on a cross-domain correspondence mechanism that synthesizes the dance-guided person image in the target video from consecutive frames and pose stick images; 3) a coarse-to-fine generation strategy that includes two GANs: one reconstructs the human face in the target video, and the other generates smoothed frame sequences. Finally, we compress the sequential frames generated by our model into video format. Compared with previous works, our model shows better person appearance consistency and temporal coherence in video-to-video synthesis for human motion transfer, which makes the generated video look more realistic. The qualitative and quantitative comparisons show that our approach achieves significant improvements over the state-of-the-art methods. Experiments on synthetic frames and ground truth validate the effectiveness of the proposed method.; https://ieeexplore.ieee.org/document/9333577/; Generative adversarial network (GAN); image-to-image translation; video-to-video synthesis; pose-guided person image generation |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Hongyu Wang, Mengxing Huang, Di Wu, Yuchun Li, Weichao Zhang |
spellingShingle |
Hongyu Wang, Mengxing Huang, Di Wu, Yuchun Li, Weichao Zhang; Supervised Video-to-Video Synthesis for Single Human Pose Transfer; IEEE Access; Generative adversarial network (GAN); image-to-image translation; video-to-video synthesis; pose-guided person image generation |
author_facet |
Hongyu Wang, Mengxing Huang, Di Wu, Yuchun Li, Weichao Zhang |
author_sort |
Hongyu Wang |
title |
Supervised Video-to-Video Synthesis for Single Human Pose Transfer |
title_short |
Supervised Video-to-Video Synthesis for Single Human Pose Transfer |
title_full |
Supervised Video-to-Video Synthesis for Single Human Pose Transfer |
title_fullStr |
Supervised Video-to-Video Synthesis for Single Human Pose Transfer |
title_full_unstemmed |
Supervised Video-to-Video Synthesis for Single Human Pose Transfer |
title_sort |
supervised video-to-video synthesis for single human pose transfer |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2021-01-01 |
description |
In this paper, we focus on human pose transfer between different videos, i.e., transferring the dance pose of a person in a given video to a target person in another video. Our method tackles this challenging scenario in three stages. Firstly, we extract the frames and pose masks from the source video and target video. Secondly, we use our model to synthesize the frames of the target person with the given dance pose. Thirdly, we refine the generated frames to improve the quality of the outputs. Our model is built on three stages: 1) human pose extraction and normalization; 2) a GAN based on a cross-domain correspondence mechanism that synthesizes the dance-guided person image in the target video from consecutive frames and pose stick images; 3) a coarse-to-fine generation strategy that includes two GANs: one reconstructs the human face in the target video, and the other generates smoothed frame sequences. Finally, we compress the sequential frames generated by our model into video format. Compared with previous works, our model shows better person appearance consistency and temporal coherence in video-to-video synthesis for human motion transfer, which makes the generated video look more realistic. The qualitative and quantitative comparisons show that our approach achieves significant improvements over the state-of-the-art methods. Experiments on synthetic frames and ground truth validate the effectiveness of the proposed method. |
topic |
Generative adversarial network (GAN); image-to-image translation; video-to-video synthesis; pose-guided person image generation |
url |
https://ieeexplore.ieee.org/document/9333577/ |
work_keys_str_mv |
AT hongyuwang supervisedvideotovideosynthesisforsinglehumanposetransfer AT mengxinghuang supervisedvideotovideosynthesisforsinglehumanposetransfer AT diwu supervisedvideotovideosynthesisforsinglehumanposetransfer AT yuchunli supervisedvideotovideosynthesisforsinglehumanposetransfer AT weichaozhang supervisedvideotovideosynthesisforsinglehumanposetransfer |
_version_ |
1724179580267266048 |
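
The description in this record names "human pose extraction and normalization" as the first stage but gives no details of how it is done. As a rough illustration only, the sketch below shows one common way pose normalization can be approached: uniformly scaling the source keypoints so that the body height matches the target person, then shifting them so that the feet line up. The function names, the height-ratio scale, and the anchor choice (horizontal centre, lowest keypoint) are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in pixels; y grows downward as in images


def body_height(pose: List[Point]) -> float:
    """Vertical extent of a pose: largest keypoint y minus smallest keypoint y."""
    ys = [y for _, y in pose]
    return max(ys) - min(ys)


def anchor(pose: List[Point]) -> Point:
    """Hypothetical anchor: mean x of all keypoints and the lowest point (max y)."""
    xs = [x for x, _ in pose]
    ys = [y for _, y in pose]
    return (sum(xs) / len(xs), max(ys))


def normalize_pose(source: List[Point], target_ref: List[Point]) -> List[Point]:
    """Scale and shift source keypoints to fit the target reference pose.

    Assumption (not from the paper): one uniform scale from the height
    ratio, applied about the source anchor, followed by a translation
    that moves the source anchor onto the target anchor.
    """
    src_h = body_height(source)
    if src_h == 0:
        raise ValueError("source pose has zero height")
    scale = body_height(target_ref) / src_h
    sx, sy = anchor(source)
    tx, ty = anchor(target_ref)
    return [((x - sx) * scale + tx, (y - sy) * scale + ty) for x, y in source]


# Usage: a two-keypoint "stick" (head, feet) mapped onto a shorter target.
source_pose = [(0.0, 0.0), (0.0, 100.0)]
target_pose = [(50.0, 20.0), (50.0, 70.0)]
normalized = normalize_pose(source_pose, target_pose)
```

In a full pipeline the keypoints would come from a pose estimator run on each frame, and the normalized keypoints would be rendered as pose stick images before being passed to the generator; neither step is shown here.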