A Study on Single Channel Speech Enhancement Using Critical-Band-Wavelet-Packet Transform

Ph.D. === National Tsing Hua University === Department of Electrical Engineering === 94 === In this dissertation, we present two new techniques for single channel speech enhancement. These techniques reduce the noise in each subband based on the critical-band-wavelet-packet decomposition. A noise masking threshold (NMT) is employed to adjust wavelet-coef...

Bibliographic Details
Main Authors: Ching-Ta Lu, 陸清達
Other Authors: Hsiao-Chuan Wang
Format: Others
Language: zh-TW
Published: 2006
Online Access: http://ndltd.ncl.edu.tw/handle/83794703030013125893
id ndltd-TW-094NTHU5442012
record_format oai_dc
spelling ndltd-TW-094NTHU5442012 2016-06-03T04:13:57Z http://ndltd.ncl.edu.tw/handle/83794703030013125893 A Study on Single Channel Speech Enhancement Using Critical-Band-Wavelet-Packet Transform 使用臨界頻帶之封裝式小波轉換於單通道語音增強之研究 Ching-Ta Lu 陸清達 Ph.D. National Tsing Hua University Department of Electrical Engineering 94 Hsiao-Chuan Wang 王小川 2006 thesis 102 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Ph.D. === National Tsing Hua University === Department of Electrical Engineering === 94 === In this dissertation, we present two new techniques for single-channel speech enhancement. Both reduce the noise in each subband of a critical-band wavelet-packet decomposition, and both employ a noise masking threshold (NMT) to adjust either the wavelet-coefficient threshold or the gain function of a subband. The first approach converts a noisy signal into wavelet coefficients (WCs) and subtracts a threshold from the noisy WCs in each subband. The threshold of each subband is adapted according to the segmental SNR and the NMT, so that the residual noise is efficiently suppressed. In noise-dominated frames, the background noise is almost completely removed. In speech-dominated subbands, speech distortion is reduced by decreasing the wavelet-coefficient threshold. Experimental results show that the background noise is reduced and that the residual noise is less structured than in a system that does not use masking properties, while the level of speech distortion remains acceptable. The second approach applies a gain factor in each wavelet subband subject to a perceptual constraint. This constraint preserves the WCs of noisy speech when the level of residual noise is smaller than the NMT. A speech enhancement algorithm adapted with the NMT can cope with noisy speech corrupted by various types of colored noise. The quality of the enhanced speech is characterized by a tradeoff between the amount of speech distortion and the level of musical residual noise. If the level of residual noise is smaller than the NMT, the human ear cannot perceive the corrupting noise, and the gain factor is set to unity. Conversely, if the level of residual noise exceeds the NMT, the gain factor becomes smaller and the corrupting noise is suppressed. Because the noise level is usually overestimated at low SNR, the gain factor is underestimated, which results in more speech distortion and a muffled sound. We therefore propose a lower bound on the gain factor to prevent over-attenuation. This lower bound is obtained by keeping the speech distortion smaller than the residual noise; the corresponding lower bound on the NMT is obtained accordingly. The lower bound on the gain factor must be adapted to the noise level to reduce speech distortion and minimize musical residual noise. Experimental results show that the enhanced speech sounds more natural and that the musical residual noise is almost inaudible.
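The two per-subband operations described in the abstract can be illustrated with a minimal Python sketch. The record gives no formulas, so everything here is an assumption made for illustration: the function names `soft_threshold` and `perceptual_gain` are hypothetical, the subtraction is modeled as ordinary soft thresholding, and the gain above the masking threshold is modeled as the simple ratio NMT / residual noise. It is not the dissertation's actual method.

```python
def soft_threshold(coeffs, threshold):
    """Subtract `threshold` from the magnitude of each wavelet coefficient.

    Assumption: the "subtract a threshold" step is modeled as standard
    soft thresholding, so magnitudes below the threshold become zero.
    """
    out = []
    for c in coeffs:
        mag = max(abs(c) - threshold, 0.0)
        out.append(mag if c >= 0 else -mag)
    return out


def perceptual_gain(residual_noise, nmt, gain_floor):
    """Gain factor for one subband under a masking constraint.

    Grounded in the abstract: the gain is unity when the residual noise
    is at or below the noise masking threshold (NMT), smaller otherwise,
    and never below a lower bound.
    Assumption: the ratio nmt / residual_noise is a hypothetical stand-in
    for the unspecified gain rule, and `gain_floor` is passed in rather
    than derived from the noise level.
    """
    if residual_noise <= nmt:
        return 1.0  # noise is masked, so leave the coefficients untouched
    gain = nmt / residual_noise
    return max(gain, gain_floor)


if __name__ == "__main__":
    # First approach: threshold subtraction in one subband.
    print(soft_threshold([3.0, -2.5, 0.5], 1.0))
    # Second approach: masked, audible, and floor-limited cases.
    for level in (0.5, 4.0, 100.0):
        print(level, perceptual_gain(level, nmt=1.0, gain_floor=0.1))
```

In a fuller implementation the threshold and the gain floor would both vary per subband and per frame, following the segmental SNR and the estimated noise level as the abstract describes.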
author2 Hsiao-Chuan Wang
author_facet Hsiao-Chuan Wang
Ching-Ta Lu
陸清達
author Ching-Ta Lu
陸清達
spellingShingle Ching-Ta Lu
陸清達
A Study on Single Channel Speech Enhancement Using Critical-Band-Wavelet-Packet Transform
author_sort Ching-Ta Lu
title A Study on Single Channel Speech Enhancement Using Critical-Band-Wavelet-Packet Transform
title_short A Study on Single Channel Speech Enhancement Using Critical-Band-Wavelet-Packet Transform
title_full A Study on Single Channel Speech Enhancement Using Critical-Band-Wavelet-Packet Transform
title_fullStr A Study on Single Channel Speech Enhancement Using Critical-Band-Wavelet-Packet Transform
title_full_unstemmed A Study on Single Channel Speech Enhancement Using Critical-Band-Wavelet-Packet Transform
title_sort study on single channel speech enhancement using critical-band-wavelet-packet transform
publishDate 2006
url http://ndltd.ncl.edu.tw/handle/83794703030013125893
work_keys_str_mv AT chingtalu astudyonsinglechannelspeechenhancementusingcriticalbandwaveletpackettransform
AT lùqīngdá astudyonsinglechannelspeechenhancementusingcriticalbandwaveletpackettransform
AT chingtalu shǐyònglínjièpíndàizhīfēngzhuāngshìxiǎobōzhuǎnhuànyúdāntōngdàoyǔyīnzēngqiángzhīyánjiū
AT lùqīngdá shǐyònglínjièpíndàizhīfēngzhuāngshìxiǎobōzhuǎnhuànyúdāntōngdàoyǔyīnzēngqiángzhīyánjiū
AT chingtalu studyonsinglechannelspeechenhancementusingcriticalbandwaveletpackettransform
AT lùqīngdá studyonsinglechannelspeechenhancementusingcriticalbandwaveletpackettransform
_version_ 1718293229020905472