Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method
This paper presents a new deep neural network (DNN)-based speech enhancement algorithm that integrates distilled knowledge from a traditional statistical method. Unlike other DNN-based methods, which typically train many different models on the same data and then average their predictions, or use a large number of noise types to enlarge the simulated noisy speech, the proposed method neither trains a whole ensemble of models nor requires a large amount of simulated noisy speech. It first trains a discriminator network and a generator network simultaneously using adversarial learning. Then, the discriminator and generator networks are re-trained by distilling knowledge from the statistical method, inspired by knowledge distillation in neural networks. Finally, the generator network is fine-tuned using real noisy speech. Experiments on the CHiME4 data sets demonstrate that the proposed method achieves more robust performance than the compared DNN-based method in terms of perceptual speech quality.
Main Authors: | Jianfeng Wu, Yongzhu Hua, Shengying Yang, Hongshuai Qin, Huibin Qin |
Format: | Article |
Language: | English |
Published: | MDPI AG, 2019-08-01 |
Series: | Applied Sciences |
Subjects: | speech enhancement; deep neural network; generative adversarial network; distill knowledge |
Online Access: | https://www.mdpi.com/2076-3417/9/16/3396 |
id |
doaj-2c7c12c0bafe4fdf869714c7125585cb |
record_format |
Article |
spelling |
doaj-2c7c12c0bafe4fdf869714c7125585cb | 2020-11-25T00:37:47Z | eng | MDPI AG | Applied Sciences | 2076-3417 | 2019-08-01 | Vol. 9, Iss. 16, Art. 3396 | doi:10.3390/app9163396 | app9163396 | Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method | Jianfeng Wu; Yongzhu Hua; Shengying Yang; Hongshuai Qin; Huibin Qin (all: The Institute of Electron Device & Application, Hangzhou Dianzi University, Hangzhou 310018, China) | This paper presents a new deep neural network (DNN)-based speech enhancement algorithm that integrates distilled knowledge from a traditional statistical method. Unlike other DNN-based methods, which typically train many different models on the same data and then average their predictions, or use a large number of noise types to enlarge the simulated noisy speech, the proposed method neither trains a whole ensemble of models nor requires a large amount of simulated noisy speech. It first trains a discriminator network and a generator network simultaneously using adversarial learning. Then, the discriminator and generator networks are re-trained by distilling knowledge from the statistical method, inspired by knowledge distillation in neural networks. Finally, the generator network is fine-tuned using real noisy speech. Experiments on the CHiME4 data sets demonstrate that the proposed method achieves more robust performance than the compared DNN-based method in terms of perceptual speech quality. | https://www.mdpi.com/2076-3417/9/16/3396 | speech enhancement; deep neural network; generative adversarial network; distill knowledge |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Jianfeng Wu; Yongzhu Hua; Shengying Yang; Hongshuai Qin; Huibin Qin |
spellingShingle |
Jianfeng Wu; Yongzhu Hua; Shengying Yang; Hongshuai Qin; Huibin Qin | Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method | Applied Sciences | speech enhancement; deep neural network; generative adversarial network; distill knowledge |
author_facet |
Jianfeng Wu; Yongzhu Hua; Shengying Yang; Hongshuai Qin; Huibin Qin |
author_sort |
Jianfeng Wu |
title |
Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method |
title_short |
Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method |
title_full |
Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method |
title_fullStr |
Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method |
title_full_unstemmed |
Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method |
title_sort |
speech enhancement using generative adversarial network by distilling knowledge from statistical method |
publisher |
MDPI AG |
series |
Applied Sciences |
issn |
2076-3417 |
publishDate |
2019-08-01 |
description |
This paper presents a new deep neural network (DNN)-based speech enhancement algorithm that integrates distilled knowledge from a traditional statistical method. Unlike other DNN-based methods, which typically train many different models on the same data and then average their predictions, or use a large number of noise types to enlarge the simulated noisy speech, the proposed method neither trains a whole ensemble of models nor requires a large amount of simulated noisy speech. It first trains a discriminator network and a generator network simultaneously using adversarial learning. Then, the discriminator and generator networks are re-trained by distilling knowledge from the statistical method, inspired by knowledge distillation in neural networks. Finally, the generator network is fine-tuned using real noisy speech. Experiments on the CHiME4 data sets demonstrate that the proposed method achieves more robust performance than the compared DNN-based method in terms of perceptual speech quality. |
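The abstract describes distilling knowledge from a statistical enhancement method into a DNN generator. As a minimal illustration of that idea (not the paper's actual implementation), the sketch below uses a classic Wiener-filter gain as a hypothetical statistical "teacher" and blends a supervised loss against the ideal mask with a distillation loss against the teacher's mask; the names `wiener_gain`, `distillation_loss`, and the blend weight `alpha` are our assumptions for illustration only.

```python
import numpy as np

def wiener_gain(noisy_power, noise_power, floor=1e-8):
    """Wiener-style gain per frequency bin: snr / (1 + snr),
    with the a priori SNR estimated by power subtraction.
    This stands in for the paper's statistical 'teacher' method."""
    snr = np.maximum(noisy_power - noise_power, floor) / np.maximum(noise_power, floor)
    return snr / (1.0 + snr)

def distillation_loss(student_mask, clean_mask, teacher_mask, alpha=0.5):
    """Blend the supervised loss (student vs. ideal mask) with a
    distillation loss (student vs. statistical teacher's mask)."""
    supervised = np.mean((student_mask - clean_mask) ** 2)
    distilled = np.mean((student_mask - teacher_mask) ** 2)
    return alpha * supervised + (1.0 - alpha) * distilled

# Toy spectra: one frame, four frequency bins (illustrative values).
noisy_power = np.array([2.0, 1.5, 0.6, 0.3])
noise_power = np.array([0.5, 0.5, 0.5, 0.2])

teacher = wiener_gain(noisy_power, noise_power)   # statistical teacher's mask
clean = np.array([0.8, 0.7, 0.2, 0.3])            # ideal mask (illustrative)
student = np.array([0.7, 0.6, 0.3, 0.4])          # current generator output

loss = distillation_loss(student, clean, teacher)
```

In the paper's pipeline this kind of teacher signal would enter during the re-training stage, after the initial adversarial training of generator and discriminator and before fine-tuning on real noisy speech; the adversarial and fine-tuning stages are omitted here for brevity.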
topic |
speech enhancement; deep neural network; generative adversarial network; distill knowledge |
url |
https://www.mdpi.com/2076-3417/9/16/3396 |
work_keys_str_mv |
AT jianfengwu speechenhancementusinggenerativeadversarialnetworkbydistillingknowledgefromstatisticalmethod AT yongzhuhua speechenhancementusinggenerativeadversarialnetworkbydistillingknowledgefromstatisticalmethod AT shengyingyang speechenhancementusinggenerativeadversarialnetworkbydistillingknowledgefromstatisticalmethod AT hongshuaiqin speechenhancementusinggenerativeadversarialnetworkbydistillingknowledgefromstatisticalmethod AT huibinqin speechenhancementusinggenerativeadversarialnetworkbydistillingknowledgefromstatisticalmethod |
_version_ |
1725299552238436352 |