Exploring Modulation Spectrum Normalization for Robust Speech Recognition

Master's thesis === National Taiwan Normal University === Graduate Institute of Computer Science and Information Engineering === 99 === Environmental mismatch caused by additive noise and/or channel distortion often seriously degrades the performance of a speech recognition system. Various robustness methods have therefore been proposed, and one prevalent school of thought aims to refine t...


Bibliographic Details
Main Authors: Wen-Yi Chu, 朱紋儀
Other Authors: Berlin Chen
Format: Others
Language: zh-TW
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/55915765644588733638
id ndltd-TW-099NTNU5392056
record_format oai_dc
spelling ndltd-TW-099NTNU5392056 2015-10-19T04:05:07Z http://ndltd.ncl.edu.tw/handle/55915765644588733638 Exploring Modulation Spectrum Normalization for Robust Speech Recognition 調變頻譜特徵正規化於強健語音辨識之研究 Wen-Yi Chu 朱紋儀 Master's thesis National Taiwan Normal University Graduate Institute of Computer Science and Information Engineering 99 Berlin Chen 陳柏琳 2011 學位論文 ; thesis 69 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's thesis === National Taiwan Normal University === Graduate Institute of Computer Science and Information Engineering === 99 === Environmental mismatch caused by additive noise and/or channel distortion often seriously degrades the performance of a speech recognition system. Various robustness methods have therefore been proposed, and one prevalent school of thought aims to refine the modulation spectra of speech feature sequences. In this thesis, we propose two novel methods for normalizing the modulation spectra of speech feature sequences. First, we leverage nonnegative matrix factorization (NMF) to extract a common set of basis spectral vectors that capture the intrinsic temporal structure inherent in the modulation spectra of clean training speech features. The new modulation spectra of the speech features, constructed by mapping the original modulation spectra into the space spanned by these basis vectors, are shown to be robust to noise. Second, to view the modulation spectra of speech feature sequences from a probabilistic perspective, we employ probabilistic latent semantic analysis (PLSA) with a latent set of topic distributions to explore the relationship between each modulation frequency and the magnitude modulation spectrum as a whole. All experiments were carried out on the Aurora-2 database and task. Experimental results show that the features updated via NMF and PLSA maintain high recognition accuracy under both matched and mismatched noisy conditions, and are quite competitive with those obtained by other existing methods.