Enhancing Speech Features in Various Domains for Noise-Robust Speech Recognition

Doctoral === National Chi Nan University === Department of Electrical Engineering === 100 === The performance of an automatic speech recognition (ASR) system is often degraded by the various types of noise and interference in the application environment. In this dissertation, we aim to develop robustness methods specifically for handling additive...

Bibliographic Details
Main Authors:	Wen-Hsiang Tu, 杜文祥
Other Authors:	Jeih-weih Hung
Format:	Others
Language:	en_US
Published:	2012
Online Access:	http://ndltd.ncl.edu.tw/handle/66756772463135462510

id	ndltd-TW-100NCNU0442106
record_format	oai_dc
spelling	ndltd-TW-100NCNU04421062015-10-13T21:07:20Z http://ndltd.ncl.edu.tw/handle/66756772463135462510 Enhancing Speech Features in Various Domains for Noise-Robust Speech Recognition 強化各域之語音特徵於雜訊強健性語音辨識之研究 Wen-Hsiang Tu 杜文祥 Doctoral National Chi Nan University Department of Electrical Engineering 100 The performance of an automatic speech recognition (ASR) system is often degraded by the various types of noise and interference in the application environment. In this dissertation, we aim to develop robustness methods specifically for handling additive noise and channel disturbance. In particular, these methods are used to refine the mel-frequency cepstral coefficients (MFCC), one of the most widely used speech feature representations in ASR. First, we discuss the effect of noise in the linear spectral domain of MFCC, and then present the approach of magnitude spectrum enhancement (MSE) to refine the spectrum of speech signals. Next, the method of hybrid cepstral statistics normalization is presented to process the MFCC in the mel-spectral domain. Finally, two novel compensation algorithms, modulation spectrum replacement (MSR) and modulation spectrum filtering (MSF), are proposed to enhance the MFCC in the cepstral domain. The recognition experiments conducted on the Aurora-2 connected-digit database show that these novel methods improve the recognition accuracy of the MFCC in various noise conditions, and in most cases they perform better than, or at least similarly to, state-of-the-art noise robustness techniques such as Wiener filtering (WF), spectral subtraction (SS), mean and variance normalization (MVN) and histogram equalization (HEQ). Jeih-weih Hung 洪志偉 2012 thesis 81 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Doctoral === National Chi Nan University === Department of Electrical Engineering === 100 === The performance of an automatic speech recognition (ASR) system is often degraded by the various types of noise and interference in the application environment. In this dissertation, we aim to develop robustness methods specifically for handling additive noise and channel disturbance. In particular, these methods are used to refine the mel-frequency cepstral coefficients (MFCC), one of the most widely used speech feature representations in ASR. First, we discuss the effect of noise in the linear spectral domain of MFCC, and then present the approach of magnitude spectrum enhancement (MSE) to refine the spectrum of speech signals. Next, the method of hybrid cepstral statistics normalization is presented to process the MFCC in the mel-spectral domain. Finally, two novel compensation algorithms, modulation spectrum replacement (MSR) and modulation spectrum filtering (MSF), are proposed to enhance the MFCC in the cepstral domain. The recognition experiments conducted on the Aurora-2 connected-digit database show that these novel methods improve the recognition accuracy of the MFCC in various noise conditions, and in most cases they perform better than, or at least similarly to, state-of-the-art noise robustness techniques such as Wiener filtering (WF), spectral subtraction (SS), mean and variance normalization (MVN) and histogram equalization (HEQ).
author2	Jeih-weih Hung
author_facet	Jeih-weih Hung Wen-Hsiang Tu 杜文祥
author	Wen-Hsiang Tu 杜文祥
spellingShingle	Wen-Hsiang Tu 杜文祥 Enhancing Speech Features in Various Domains for Noise-Robust Speech Recognition
author_sort	Wen-Hsiang Tu
title	Enhancing Speech Features in Various Domains for Noise-Robust Speech Recognition
title_short	Enhancing Speech Features in Various Domains for Noise-Robust Speech Recognition
title_full	Enhancing Speech Features in Various Domains for Noise-Robust Speech Recognition
title_fullStr	Enhancing Speech Features in Various Domains for Noise-Robust Speech Recognition
title_full_unstemmed	Enhancing Speech Features in Various Domains for Noise-Robust Speech Recognition
title_sort	enhancing speech features in various domains for noise-robust speech recognition
publishDate	2012
url	http://ndltd.ncl.edu.tw/handle/66756772463135462510
work_keys_str_mv	AT wenhsiangtu enhancingspeechfeaturesinvariousdomainsfornoiserobustspeechrecognition AT dùwénxiáng enhancingspeechfeaturesinvariousdomainsfornoiserobustspeechrecognition AT wenhsiangtu qiánghuàgèyùzhīyǔyīntèzhēngyúzáxùnqiángjiànxìngyǔyīnbiànshízhīyánjiū AT dùwénxiáng qiánghuàgèyùzhīyǔyīntèzhēngyúzáxùnqiángjiànxìngyǔyīnbiànshízhīyánjiū
_version_	1718056146102648832
