Exploring Modulation Spectrum Normalization for Robust Speech Recognition
Master's === National Taiwan Normal University === Graduate Institute of Computer Science and Information Engineering === 99 === The environmental mismatch caused by additive noise and/or channel distortion often seriously degrades the performance of a speech recognition system. Therefore, various robustness methods have been proposed, and one prevalent school of thought aims to refine t...
Main Authors: | Wen-Yi Chu, 朱紋儀 |
---|---|
Other Authors: | Berlin Chen |
Format: | Others |
Language: | zh-TW |
Published: | 2011 |
Online Access: | http://ndltd.ncl.edu.tw/handle/55915765644588733638 |
id |
ndltd-TW-099NTNU5392056 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-099NTNU5392056 2015-10-19T04:05:07Z http://ndltd.ncl.edu.tw/handle/55915765644588733638 Exploring Modulation Spectrum Normalization for Robust Speech Recognition 調變頻譜特徵正規化於強健語音辨識之研究 Wen-Yi Chu 朱紋儀 Master's National Taiwan Normal University Graduate Institute of Computer Science and Information Engineering 99 Berlin Chen 陳柏琳 2011 Degree thesis ; thesis 69 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan Normal University === Graduate Institute of Computer Science and Information Engineering === 99 === The environmental mismatch caused by additive noise and/or channel distortion often seriously degrades the performance of a speech recognition system. Therefore, various robustness methods have been proposed, and one prevalent school of thought aims to refine the modulation spectra of speech feature sequences. In this thesis, we propose two novel methods to normalize the modulation spectra of speech feature sequences. First, we leverage nonnegative matrix factorization (NMF) to extract a common set of basis spectral vectors that capture the intrinsic temporal structure inherent in the modulation spectra of clean training speech features. The new modulation spectra of the speech features, constructed by mapping the original modulation spectra into the space spanned by these basis vectors, are shown to have good noise robustness. Second, to view the modulation spectra of speech feature sequences from a probabilistic perspective, we employ probabilistic latent semantic analysis (PLSA) with a latent set of topic distributions to explore the relationship between each modulation frequency and the magnitude modulation spectrum as a whole. All experiments were carried out on the Aurora-2 database and task. Experimental results show that the features updated via NMF and PLSA maintain high recognition accuracy under matched and mismatched noisy conditions, and that this accuracy is quite competitive with that of other existing methods.
|
author2 |
Berlin Chen |
author |
Wen-Yi Chu 朱紋儀 |
title |
Exploring Modulation Spectrum Normalization for Robust Speech Recognition |
publishDate |
2011 |
url |
http://ndltd.ncl.edu.tw/handle/55915765644588733638 |
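The NMF-based normalization summarized in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names (`modulation_spectrum`, `nmf`, `encode`, `normalize_sequence`), the multiplicative-update rule, the rank, the FFT length, and the choice to keep the original phase and resynthesize with an inverse DFT are all assumptions made for illustration. The abstract only states that basis vectors are learned from the modulation spectra of clean training features and that the original modulation spectra are mapped into the space they span.

```python
import numpy as np

def modulation_spectrum(feature_seq, n_fft):
    """Magnitude and phase of the DFT of one feature trajectory over time."""
    spec = np.fft.rfft(feature_seq, n=n_fft)
    return np.abs(spec), np.angle(spec)

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Multiplicative-update NMF with Euclidean cost (assumed rule): V ~ W @ H."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def encode(W, v, n_iter=200, eps=1e-9):
    """Nonnegative activations h, with the basis W held fixed, so that v ~ W @ h."""
    h = np.ones(W.shape[1])
    for _ in range(n_iter):
        h *= (W.T @ v) / (W.T @ W @ h + eps)
    return h

def normalize_sequence(feature_seq, W, n_fft):
    """Replace the magnitude modulation spectrum by its NMF approximation.

    Assumption: the original phase is kept and the sequence is
    resynthesized with an inverse DFT.
    """
    mag, phase = modulation_spectrum(feature_seq, n_fft)
    new_mag = W @ encode(W, mag)
    spec = new_mag * np.exp(1j * phase)
    return np.fft.irfft(spec, n=n_fft)[: len(feature_seq)]

# Synthetic stand-ins for clean training feature trajectories (hypothetical data).
rng = np.random.default_rng(0)
clean = [np.convolve(rng.standard_normal(100), np.ones(5) / 5, mode="same")
         for _ in range(30)]

# Columns of V are magnitude modulation spectra of the clean sequences.
V = np.stack([modulation_spectrum(s, 128)[0] for s in clean], axis=1)
W, _ = nmf(V, rank=4)

# Normalize a noise-corrupted sequence with the basis learned from clean data.
noisy = clean[0] + 0.5 * rng.standard_normal(100)
normalized = normalize_sequence(noisy, W, 128)
```

In this sketch the basis `W` is learned once from clean data and then held fixed, so a noisy sequence can only be described in terms of patterns seen in clean modulation spectra. A real system would apply this per feature dimension (for example, per cepstral coefficient) rather than to a single synthetic trajectory.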