Enhancing Speech Features in Various Domains for Noise-Robust Speech Recognition

Doctoral === National Chi Nan University === Department of Electrical Engineering === 100 === The performance of an automatic speech recognition (ASR) system is often degraded by the various types of noise and interference in the application environment. In this dissertation, we aim to develop robustness methods specifically for handling additive...


Bibliographic Details
Main Authors: Wen-Hsiang Tu, 杜文祥
Other Authors: Jeih-weih Hung
Format: Others
Language: en_US
Published: 2012
Online Access: http://ndltd.ncl.edu.tw/handle/66756772463135462510
id ndltd-TW-100NCNU0442106
record_format oai_dc
spelling ndltd-TW-100NCNU0442106 2015-10-13T21:07:20Z http://ndltd.ncl.edu.tw/handle/66756772463135462510 Enhancing Speech Features in Various Domains for Noise-Robust Speech Recognition 強化各域之語音特徵於雜訊強健性語音辨識之研究 Wen-Hsiang Tu 杜文祥 Doctoral National Chi Nan University Department of Electrical Engineering 100 Jeih-weih Hung 洪志偉 2012 thesis 81 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Doctoral === National Chi Nan University === Department of Electrical Engineering === 100 === The performance of an automatic speech recognition (ASR) system is often degraded by the various types of noise and interference in the application environment. In this dissertation, we aim to develop robustness methods specifically for handling additive noise and channel disturbance. In particular, the developed methods are used to refine the mel-frequency cepstral coefficients (MFCC), one of the most widely used speech feature representations in ASR. First, we discuss the effect of noise in the linear spectral domain of the MFCC and present magnitude spectrum enhancement (MSE), an approach that refines the spectrum of speech signals. Next, we present hybrid cepstral statistics normalization, a method that processes the MFCC in the mel-spectral domain. Finally, we provide two novel compensation algorithms, modulation spectrum replacement (MSR) and modulation spectrum filtering (MSF), to enhance the MFCC in the cepstral domain. Recognition experiments conducted on the Aurora-2 connected-digit database show that these novel methods improve the recognition accuracy of the MFCC in various noise conditions, and that in most cases they perform better than, or at least similarly to, state-of-the-art noise robustness techniques such as Wiener filtering (WF), spectral subtraction (SS), mean and variance normalization (MVN), and histogram equalization (HEQ).
author2 Jeih-weih Hung
author_facet Jeih-weih Hung
Wen-Hsiang Tu
杜文祥
author Wen-Hsiang Tu
杜文祥
author_sort Wen-Hsiang Tu
title Enhancing Speech Features in Various Domains for Noise-Robust Speech Recognition
publishDate 2012
url http://ndltd.ncl.edu.tw/handle/66756772463135462510