DualPathGAN: Facial reenacted emotion synthesis

Abstract: Facial reenactment has developed rapidly in recent years, but few methods have been built upon reenacted faces in videos. Facial-reenacted emotion synthesis can make the process of facial reenactment more practical. A facial-reenacted emotion synthesis method is proposed that includes a dual-path generative adversarial network (GAN) for emotion synthesis and a residual-mask network that imposes structural restrictions to preserve the mouth shape of the source person. To train the dual-path GAN more effectively, a learning strategy based on separated discriminators is proposed. The method is trained and tested on a highly challenging imbalanced dataset to evaluate its ability to handle complex practical scenarios. Compared with general emotion synthesis methods, the proposed method generates more realistic, higher-quality emotion-synthesised images and videos while retaining the expression content of the original videos. DualPathGAN achieves a Fréchet inception distance (FID) score of 9.20, lower than the 11.37 achieved by state-of-the-art methods.
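
The abstract reports image quality using the Fréchet inception distance (FID), where lower is better (9.20 for DualPathGAN versus 11.37 for prior methods). For context, below is a minimal sketch of how FID is conventionally computed from Inception-v3 feature vectors; this is a generic illustration of the standard metric, not the authors' evaluation code, and the function name and input arrays are assumptions.

```python
import numpy as np
from scipy import linalg

def frechet_inception_distance(feats_real, feats_fake):
    """Compute FID between two sets of Inception feature vectors.

    feats_real, feats_fake: arrays of shape (n_samples, feature_dim),
    e.g. 2048-d pool3 activations from an Inception-v3 network.
    """
    # Fit a Gaussian (mean, covariance) to each feature set.
    mu1, sigma1 = feats_real.mean(axis=0), np.cov(feats_real, rowvar=False)
    mu2, sigma2 = feats_fake.mean(axis=0), np.cov(feats_fake, rowvar=False)

    # Matrix square root of the covariance product.
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    if np.iscomplexobj(covmean):
        # Discard negligible imaginary parts from numerical error.
        covmean = covmean.real

    diff = mu1 - mu2
    # FID = ||mu1 - mu2||^2 + Tr(S1 + S2 - 2*sqrt(S1*S2))
    return diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean)
```

In practice the feature vectors are extracted by running both the real frames and the synthesised frames through the same pretrained Inception-v3 network before calling a function like the one above.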


Bibliographic Details
Main Authors: Jiahui Kong, Haibin Shen, Kejie Huang
Format: Article
Language: English
Published: Wiley, 2021-10-01
Series: IET Computer Vision
Online Access: https://doi.org/10.1049/cvi2.12047
ISSN: 1751-9632, 1751-9640
Citation: IET Computer Vision, vol. 15, no. 7, pp. 501–513, October 2021
DOI: 10.1049/cvi2.12047
Author Affiliations: Jiahui Kong, Haibin Shen, and Kejie Huang are all with the College of Information Science & Electrical Engineering, Zhejiang University, Hangzhou, China