DualPathGAN: Facial reenacted emotion synthesis
Abstract: Facial reenactment has developed rapidly in recent years, but few methods have been built upon the reenacted faces in videos. Facial-reenacted emotion synthesis can make facial reenactment more practical. A facial-reenacted emotion synthesis method is proposed that comprises a dual-path generative adversarial network (GAN) for emotion synthesis and a residual-mask network that imposes structural restrictions to preserve the mouth shape of the source person. To train the dual-path GAN more effectively, a learning strategy based on separated discriminators is proposed. The method is trained and tested on a highly imbalanced dataset to evaluate its ability to handle complex practical scenarios. Compared with general emotion synthesis methods, the proposed method generates more realistic, higher-quality emotion-synthesised images and videos while retaining the expression content of the original videos. DualPathGAN achieves a Fréchet inception distance (FID) score of 9.20, lower than the 11.37 achieved by state-of-the-art methods.
Main Authors: Jiahui Kong, Haibin Shen, Kejie Huang (College of Information Science & Electrical Engineering, Zhejiang University, Hangzhou, China)
Format: Article
Language: English
Published: Wiley, 2021-10-01
Series: IET Computer Vision, vol. 15, no. 7, pp. 501-513
ISSN: 1751-9632, 1751-9640
Online Access: https://doi.org/10.1049/cvi2.12047
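
The abstract names a dual-path generator, a residual-mask network that preserves the source person's mouth shape, and a training strategy with separated discriminators, but gives no implementation details. The sketch below is purely illustrative, not the paper's architecture: a minimal PyTorch reading of those ideas in which two convolutional paths (a content path and an emotion-conditioned path) are fused into a residual plus a soft mask. All layer sizes, the 8-class emotion code, and the fusion scheme are assumptions.

```python
# Hypothetical sketch only: one plausible wiring of a "dual-path"
# generator with a residual-mask output. Details are NOT from the paper.
import torch
import torch.nn as nn

class DualPathGenerator(nn.Module):
    def __init__(self, ch=64, n_emotions=8):
        super().__init__()
        # Path 1: assumed content/identity path over the source frame.
        self.content_path = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
        # Path 2: assumed emotion path, conditioned on a one-hot emotion
        # code broadcast to a feature map (a common conditioning trick).
        self.emotion_path = nn.Sequential(
            nn.Conv2d(3 + n_emotions, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
        # Fuse both paths into a 3-channel residual plus a 1-channel soft
        # mask, so unmasked regions (e.g. the mouth) keep the source pixels.
        self.fuse = nn.Conv2d(2 * ch, 4, 3, padding=1)

    def forward(self, frame, emotion_onehot):
        b, _, h, w = frame.shape
        cond = emotion_onehot[:, :, None, None].expand(-1, -1, h, w)
        f1 = self.content_path(frame)
        f2 = self.emotion_path(torch.cat([frame, cond], dim=1))
        out = self.fuse(torch.cat([f1, f2], dim=1))
        residual, mask = out[:, :3], torch.sigmoid(out[:, 3:4])
        # Mask near 0 preserves the source frame; near 1 applies the edit.
        return frame + mask * residual

# Usage: g = DualPathGenerator()
#        fake = g(torch.randn(2, 3, 128, 128), torch.eye(8)[:2])
```

Under this reading, "separated discriminators" would mean training the output against two independent critics, for instance one scoring photo-realism and one scoring the target emotion, each with its own adversarial loss and optimiser step; the paper does not spell out this split in the abstract.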
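
The abstract's quantitative claim (FID 9.20 versus 11.37, lower is better) uses the standard Fréchet inception distance: given the means and covariances of Inception-v3 pool features for real and generated images, FID is ||mu_r - mu_g||^2 + Tr(C_r + C_g - 2(C_r C_g)^(1/2)). A minimal NumPy/SciPy sketch of that closed form, with the feature-extraction step omitted:

```python
# Fréchet inception distance from feature statistics (standard formula,
# not code from the paper). mu_*: mean vectors, cov_*: covariance matrices
# of Inception features for real (r) and generated (g) images.
import numpy as np
from scipy.linalg import sqrtm

def fid(mu_r, cov_r, mu_g, cov_g):
    covmean = sqrtm(cov_r @ cov_g)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # drop tiny imaginary parts from numerical noise
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r + cov_g - 2.0 * covmean))
```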