Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method
This paper presents a new deep neural network (DNN)-based speech enhancement algorithm that integrates distilled knowledge from a traditional statistical method. Unlike other DNN-based methods, which typically train many different models on the same data and then average their predictions, or use a large number of noise types to enlarge the simulated noisy speech, the proposed method neither trains a whole ensemble of models nor requires a large amount of simulated noisy speech. It first trains a discriminator network and a generator network simultaneously using adversarial learning. Then, the discriminator and generator networks are re-trained by distilling knowledge from the statistical method, inspired by knowledge distillation in neural networks. Finally, the generator network is fine-tuned using real noisy speech. Experiments on the CHiME4 data sets demonstrate that the proposed method achieves more robust performance than the compared DNN-based method in terms of perceptual speech quality.
Main Authors: | Jianfeng Wu, Yongzhu Hua, Shengying Yang, Hongshuai Qin, Huibin Qin |
Format: | Article |
Language: | English |
Published: | MDPI AG, 2019-08-01 |
Series: | Applied Sciences |
Subjects: | speech enhancement; deep neural network; generative adversarial network; distill knowledge |
Online Access: | https://www.mdpi.com/2076-3417/9/16/3396 |
id |
doaj-2c7c12c0bafe4fdf869714c7125585cb |
record_format |
Article |
spelling |
doaj-2c7c12c0bafe4fdf869714c7125585cb | 2020-11-25T00:37:47Z | eng | MDPI AG | Applied Sciences | 2076-3417 | 2019-08-01 | Vol. 9, Iss. 16, Art. 3396 | doi:10.3390/app9163396 | app9163396 | Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method | Jianfeng Wu; Yongzhu Hua; Shengying Yang; Hongshuai Qin; Huibin Qin (all: The Institute of Electron Device & Application, Hangzhou Dianzi University, Hangzhou 310018, China) | This paper presents a new deep neural network (DNN)-based speech enhancement algorithm that integrates distilled knowledge from a traditional statistical method. Unlike other DNN-based methods, which typically train many different models on the same data and then average their predictions, or use a large number of noise types to enlarge the simulated noisy speech, the proposed method neither trains a whole ensemble of models nor requires a large amount of simulated noisy speech. It first trains a discriminator network and a generator network simultaneously using adversarial learning. Then, the discriminator and generator networks are re-trained by distilling knowledge from the statistical method, inspired by knowledge distillation in neural networks. Finally, the generator network is fine-tuned using real noisy speech. Experiments on the CHiME4 data sets demonstrate that the proposed method achieves more robust performance than the compared DNN-based method in terms of perceptual speech quality. | https://www.mdpi.com/2076-3417/9/16/3396 | speech enhancement; deep neural network; generative adversarial network; distill knowledge |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Jianfeng Wu; Yongzhu Hua; Shengying Yang; Hongshuai Qin; Huibin Qin |
spellingShingle |
Jianfeng Wu; Yongzhu Hua; Shengying Yang; Hongshuai Qin; Huibin Qin | Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method | Applied Sciences | speech enhancement; deep neural network; generative adversarial network; distill knowledge |
author_facet |
Jianfeng Wu; Yongzhu Hua; Shengying Yang; Hongshuai Qin; Huibin Qin |
author_sort |
Jianfeng Wu |
title |
Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method |
title_short |
Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method |
title_full |
Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method |
title_fullStr |
Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method |
title_full_unstemmed |
Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method |
title_sort |
speech enhancement using generative adversarial network by distilling knowledge from statistical method |
publisher |
MDPI AG |
series |
Applied Sciences |
issn |
2076-3417 |
publishDate |
2019-08-01 |
description |
This paper presents a new deep neural network (DNN)-based speech enhancement algorithm that integrates distilled knowledge from a traditional statistical method. Unlike other DNN-based methods, which typically train many different models on the same data and then average their predictions, or use a large number of noise types to enlarge the simulated noisy speech, the proposed method neither trains a whole ensemble of models nor requires a large amount of simulated noisy speech. It first trains a discriminator network and a generator network simultaneously using adversarial learning. Then, the discriminator and generator networks are re-trained by distilling knowledge from the statistical method, inspired by knowledge distillation in neural networks. Finally, the generator network is fine-tuned using real noisy speech. Experiments on the CHiME4 data sets demonstrate that the proposed method achieves more robust performance than the compared DNN-based method in terms of perceptual speech quality. |
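The abstract describes distilling knowledge from a statistical enhancement method into a DNN generator. As a minimal illustration of that idea (not the paper's actual implementation), the sketch below uses a classic Wiener-filter gain as a hypothetical statistical "teacher" and blends a supervised loss against the ideal mask with a distillation loss against the teacher's mask; the names `wiener_gain`, `distillation_loss`, and the blend weight `alpha` are our assumptions for illustration only.

```python
import numpy as np

def wiener_gain(noisy_power, noise_power, floor=1e-8):
    """Wiener-style gain per frequency bin: snr / (1 + snr),
    with the a priori SNR estimated by power subtraction.
    This stands in for the paper's statistical 'teacher' method."""
    snr = np.maximum(noisy_power - noise_power, floor) / np.maximum(noise_power, floor)
    return snr / (1.0 + snr)

def distillation_loss(student_mask, clean_mask, teacher_mask, alpha=0.5):
    """Blend the supervised loss (student vs. ideal mask) with a
    distillation loss (student vs. statistical teacher's mask)."""
    supervised = np.mean((student_mask - clean_mask) ** 2)
    distilled = np.mean((student_mask - teacher_mask) ** 2)
    return alpha * supervised + (1.0 - alpha) * distilled

# Toy spectra: one frame, four frequency bins (illustrative values).
noisy_power = np.array([2.0, 1.5, 0.6, 0.3])
noise_power = np.array([0.5, 0.5, 0.5, 0.2])

teacher = wiener_gain(noisy_power, noise_power)   # statistical teacher's mask
clean = np.array([0.8, 0.7, 0.2, 0.3])            # ideal mask (illustrative)
student = np.array([0.7, 0.6, 0.3, 0.4])          # current generator output

loss = distillation_loss(student, clean, teacher)
```

In the paper's pipeline this kind of teacher signal would enter during the re-training stage, after the initial adversarial training of generator and discriminator and before fine-tuning on real noisy speech; the adversarial and fine-tuning stages are omitted here for brevity.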
topic |
speech enhancement; deep neural network; generative adversarial network; distill knowledge |
url |
https://www.mdpi.com/2076-3417/9/16/3396 |
work_keys_str_mv |
AT jianfengwu speechenhancementusinggenerativeadversarialnetworkbydistillingknowledgefromstatisticalmethod AT yongzhuhua speechenhancementusinggenerativeadversarialnetworkbydistillingknowledgefromstatisticalmethod AT shengyingyang speechenhancementusinggenerativeadversarialnetworkbydistillingknowledgefromstatisticalmethod AT hongshuaiqin speechenhancementusinggenerativeadversarialnetworkbydistillingknowledgefromstatisticalmethod AT huibinqin speechenhancementusinggenerativeadversarialnetworkbydistillingknowledgefromstatisticalmethod |
_version_ |
1725299552238436352 |