A Study on Single Channel Speech Enhancement Using Critical-Band-Wavelet-Packet Transform

Ph.D. === National Tsing Hua University === Department of Electrical Engineering === 94 === In this dissertation, we present two new techniques for single channel speech enhancement. These techniques reduce the noise in each subband based on the critical-band-wavelet-packet decomposition. A noise masking threshold (NMT) is employed to adjust wavelet-coef...

Bibliographic Details
Main Authors:	Ching-Ta Lu, 陸清達
Other Authors:	Hsiao-Chuan Wang
Format:	Others
Language:	zh-TW
Published:	2006
Online Access:	http://ndltd.ncl.edu.tw/handle/83794703030013125893

id	ndltd-TW-094NTHU5442012
record_format	oai_dc
spelling	ndltd-TW-094NTHU5442012 2016-06-03T04:13:57Z http://ndltd.ncl.edu.tw/handle/83794703030013125893 A Study on Single Channel Speech Enhancement Using Critical-Band-Wavelet-Packet Transform 使用臨界頻帶之封裝式小波轉換於單通道語音增強之研究 Ching-Ta Lu 陸清達 博士 國立清華大學 電機工程學系 94 Hsiao-Chuan Wang 王小川 2006 學位論文 ; thesis 102 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Ph.D. === National Tsing Hua University === Department of Electrical Engineering === 94 === In this dissertation, we present two new techniques for single-channel speech enhancement. Both techniques reduce the noise in each subband of the critical-band-wavelet-packet decomposition, and both use a noise masking threshold (NMT) to adjust either the wavelet-coefficient threshold or the gain function of a subband. The first approach converts a noisy signal into wavelet coefficients (WCs) and subtracts a threshold from the noisy WCs in each subband. The threshold of each subband is adapted according to the segmental SNR and the NMT, so the residual noise is suppressed efficiently. In noise-dominated frames, the background noise is almost completely removed. In speech-dominated subbands, the speech distortion is reduced by lowering the wavelet-coefficient threshold. Experimental results show that the background noise is reduced and that the residual noise is less structured than in a system that does not use masking properties, while the level of speech distortion remains acceptable. The second approach proposes a gain factor in each wavelet subband subject to a perceptual constraint. This constraint preserves the WCs of noisy speech when the level of residual noise is below the NMT. A speech enhancement algorithm adapted with the NMT can cope with speech corrupted by various types of colored noise. The quality of the enhanced speech is characterized by a tradeoff between the amount of speech distortion and the level of musical residual noise. If the level of residual noise is below the NMT, the human ear cannot perceive the corrupting noise, and the gain factor is set to unity. Conversely, if the level of residual noise exceeds the NMT, the gain factor becomes smaller and the corrupting noise is suppressed. Because the noise level is usually overestimated at low SNR, the gain factor is underestimated, which results in more speech distortion and a muffled sound. We therefore propose a lower bound on the gain factor to prevent the noise from being over-attenuated. The lower bound on the gain factor is obtained by keeping the speech distortion smaller than the residual noise; the corresponding lower bound on the NMT follows from it. This lower bound on the gain factor must be adapted to the noise level to reduce the speech distortion and minimize the musical residual noise. Experimental results show that the enhanced speech sounds more natural and that the musical residual noise is almost inaudible.
author2	Hsiao-Chuan Wang
author_facet	Hsiao-Chuan Wang Ching-Ta Lu 陸清達
author	Ching-Ta Lu 陸清達
spellingShingle	Ching-Ta Lu 陸清達 A Study on Single Channel Speech Enhancement Using Critical-Band-Wavelet-Packet Transform
author_sort	Ching-Ta Lu
title	A Study on Single Channel Speech Enhancement Using Critical-Band-Wavelet-Packet Transform
title_sort	study on single channel speech enhancement using critical-band-wavelet-packet transform
publishDate	2006
url	http://ndltd.ncl.edu.tw/handle/83794703030013125893
_version_	1718293229020905472
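
The two per-subband rules summarized in the description field can be illustrated with a minimal Python sketch. This is not the dissertation's implementation: the record gives no formulas, so the function names, the soft-thresholding reading of "subtract a threshold", the square-root attenuation curve, and the `gain_floor` parameter are all assumptions made for illustration only.

```python
import math


def subtract_threshold(coeffs, threshold):
    """Shrink each wavelet coefficient toward zero by `threshold`.

    Assumption: "subtracting a threshold" is read here as soft
    thresholding, which keeps the sign of every coefficient.
    """
    shrunk = []
    for c in coeffs:
        magnitude = max(abs(c) - threshold, 0.0)
        shrunk.append(math.copysign(magnitude, c))
    return shrunk


def subband_gain(residual_noise, nmt, gain_floor):
    """Gain factor for one subband.

    `residual_noise` and `nmt` are non-negative power levels on the
    same scale (an assumption); `gain_floor` is a hypothetical
    stand-in for the lower bound on the gain factor.
    """
    if residual_noise <= nmt:
        # Residual noise is masked and inaudible: keep the coefficients.
        return 1.0
    # Hypothetical attenuation curve. The abstract only states that the
    # gain becomes smaller once the residual noise exceeds the NMT.
    gain = math.sqrt(nmt / residual_noise)
    # The floor limits over-attenuation when the noise is overestimated.
    return max(gain, gain_floor)
```

For example, `subband_gain(0.5, 1.0, 0.2)` leaves a subband untouched because its residual noise is below the masking threshold, while `subband_gain(100.0, 1.0, 0.2)` is clipped to the floor instead of attenuating further.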

A Study on Single Channel Speech Enhancement Using Critical-Band-Wavelet-Packet Transform

Similar Items