Low-complexity artificial noise suppression methods for deep learning-based speech enhancement algorithms

Abstract Deep learning-based speech enhancement algorithms have shown their powerful ability in removing both stationary and non-stationary noise components from noisy speech observations. But they often introduce artificial residual noise, especially when the training target does not contain the ph...


Bibliographic Details
Main Authors:	Yuxuan Ke, Andong Li, Chengshi Zheng, Renhua Peng, Xiaodong Li
Format:	Article
Language:	English
Published:	SpringerOpen 2021-04-01
Series:	EURASIP Journal on Audio, Speech, and Music Processing
Subjects:	Speech enhancement; Artificial residual noise; Postprocessing scheme
Online Access:	https://doi.org/10.1186/s13636-021-00204-9

id	doaj-34785bf925e84260b12685a9abf89f31
record_format	Article
spelling	doaj-34785bf925e84260b12685a9abf89f31; 2021-04-18T11:24:18Z; eng; SpringerOpen; EURASIP Journal on Audio, Speech, and Music Processing; 1687-4722; 2021-04-01; 20211115; 10.1186/s13636-021-00204-9; Low-complexity artificial noise suppression methods for deep learning-based speech enhancement algorithms; Yuxuan Ke, Andong Li, Chengshi Zheng, Renhua Peng, Xiaodong Li (all: Key Laboratory of Noise and Vibration Research, Institute of Acoustics, Chinese Academy of Sciences); https://doi.org/10.1186/s13636-021-00204-9; Speech enhancement; Artificial residual noise; Postprocessing scheme
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Yuxuan Ke, Andong Li, Chengshi Zheng, Renhua Peng, Xiaodong Li
author_sort	Yuxuan Ke
title	Low-complexity artificial noise suppression methods for deep learning-based speech enhancement algorithms
publisher	SpringerOpen
series	EURASIP Journal on Audio, Speech, and Music Processing
issn	1687-4722
publishDate	2021-04-01
description	Abstract Deep learning-based speech enhancement algorithms have shown their powerful ability in removing both stationary and non-stationary noise components from noisy speech observations. But they often introduce artificial residual noise, especially when the training target does not contain the phase information, e.g., ideal ratio mask, or the clean speech magnitude and its variations. It is well-known that once the power of the residual noise components exceeds the noise masking threshold of the human auditory system, the perceptual speech quality may degrade. One intuitive way is to further suppress the residual noise components by a postprocessing scheme. However, the highly non-stationary nature of this kind of residual noise makes the noise power spectral density (PSD) estimation a challenging problem. To solve this problem, the paper proposes three strategies to estimate the noise PSD frame by frame, and then the residual noise can be removed effectively by applying a gain function based on the decision-directed approach. The objective measurement results show that the proposed postfiltering strategies outperform the conventional postfilter in terms of segmental signal-to-noise ratio (SNR) as well as speech quality improvement. Moreover, the AB subjective listening test shows that the preference percentages of the proposed strategies are over 60%.
topic	Speech enhancement; Artificial residual noise; Postprocessing scheme
url	https://doi.org/10.1186/s13636-021-00204-9
_version_	1721522364605267968
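
The description in this record mentions a gain function based on the decision-directed approach, applied once a noise PSD estimate is available for each frame. The record does not give the paper's equations, so the following is only a minimal sketch of the classic decision-directed a priori SNR estimate. The Wiener-type gain, the smoothing factor `alpha`, the floor `xi_floor`, the first-frame initialization, and the function name are all assumptions made for illustration; the three noise PSD estimation strategies from the paper are not reproduced and the noise PSD is simply taken as an input.

```python
def decision_directed_gains(noisy_power, noise_psd, alpha=0.98, xi_floor=1e-3):
    """Per-bin spectral gains from a decision-directed a priori SNR estimate.

    noisy_power[l][k]: squared magnitude of the input spectrum, frame l, bin k.
    noise_psd[l][k]:   noise PSD estimate for the same frame and bin
                       (assumed to be supplied by some external estimator).
    Returns gains[l][k], each strictly between 0 and 1.
    """
    eps = 1e-12                  # guards against division by zero
    gains = []
    prev_clean_power = None      # estimated clean power of the previous frame
    prev_noise = None            # noise PSD of the previous frame

    for frame_power, frame_noise in zip(noisy_power, noise_psd):
        frame_gains = []
        clean_power = []
        for k, (y2, n2) in enumerate(zip(frame_power, frame_noise)):
            n2 = max(n2, eps)
            gamma = y2 / n2                   # a posteriori SNR
            ml_term = max(gamma - 1.0, 0.0)   # instantaneous SNR estimate
            if prev_clean_power is None:
                # First frame: no history yet (assumed initialization).
                xi = ml_term
            else:
                # Decision-directed recursion: blend the previous frame's
                # estimated SNR with the current instantaneous estimate.
                xi = (alpha * prev_clean_power[k] / prev_noise[k]
                      + (1.0 - alpha) * ml_term)
            xi = max(xi, xi_floor)            # floor on the a priori SNR
            g = xi / (1.0 + xi)               # Wiener-type gain (assumption)
            frame_gains.append(g)
            clean_power.append(g * g * y2)
        gains.append(frame_gains)
        prev_clean_power = clean_power
        prev_noise = [max(n, eps) for n in frame_noise]
    return gains
```

As a usage example, calling `decision_directed_gains([[1.0, 100.0], [1.0, 100.0]], [[1.0, 1.0], [1.0, 1.0]])` strongly attenuates the first bin, where the input power equals the noise PSD, and passes the second bin almost unchanged. A larger `alpha` gives smoother gains over time at the cost of slower tracking of speech onsets.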


Similar Items