The Study of Sub-band Feature Statistics Compensation Techniques Based on a Discrete Wavelet Transform for Robust Speech Recognition


Bibliographic Details
Main Authors:	Hao-teng Fan, 范顥騰
Other Authors:	Jeih-weih Hung
Format:	Others
Language:	zh-TW
Published:	2009
Online Access:	http://ndltd.ncl.edu.tw/handle/24264693292357990593

id	ndltd-TW-097NCNU0442014
record_format	oai_dc
spelling	ndltd-TW-097NCNU0442014 2016-05-06T04:11:48Z http://ndltd.ncl.edu.tw/handle/24264693292357990593 The Study of Sub-band Feature Statistics Compensation Techniques Based on a Discrete Wavelet Transform for Robust Speech Recognition 強健性語音辨識中基於小波轉換之分頻統計補償技術的研究 Hao-teng Fan 范顥騰 Master's, National Chi Nan University (國立暨南國際大學), Department of Electrical Engineering, 97 Jeih-weih Hung 洪志偉 2009 thesis 50 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === National Chi Nan University (國立暨南國際大學) === Department of Electrical Engineering === 97 === The environmental mismatch caused by additive noise and/or channel distortion often seriously degrades the performance of a speech recognition system. Various robustness techniques have been proposed to reduce this mismatch, and one category of them aims to normalize the statistics of speech features in both training and testing conditions. In general, these statistics normalization methods process the speech feature sequences in a full-band manner, which ignores the fact that different modulation frequency components are not equally important for speech recognition. Based on this observation, we propose that the speech feature streams be processed in a sub-band manner. The temporal-domain feature sequence is first decomposed into non-uniform sub-bands using the discrete wavelet transform (DWT), and each sub-band stream is then processed individually by a well-known normalization method, such as mean and variance normalization (MVN) or histogram equalization (HEQ). Finally, the feature stream is reconstructed from all the modified sub-band streams using the inverse DWT. With this process, the components of the feature sequence that correspond to the more important modulation spectral bands can be processed separately. For the Aurora-2 clean-condition training task, the proposed sub-band MVN and HEQ provide relative error rate reductions of 20.32% and 16.39% over the conventional MVN and HEQ, respectively. These results show that the proposed methods significantly enhance the robustness of speech features in noise-corrupted environments.
author2	Jeih-weih Hung
author_facet	Jeih-weih Hung Hao-teng Fan 范顥騰
author	Hao-teng Fan 范顥騰
author_sort	Hao-teng Fan
title	The Study of Sub-band Feature Statistics Compensation Techniques Based on a Discrete Wavelet Transform for Robust Speech Recognition
publishDate	2009
url	http://ndltd.ncl.edu.tw/handle/24264693292357990593
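
The three steps named in the abstract (DWT decomposition, per-sub-band normalization, inverse DWT reconstruction) can be illustrated with a minimal sketch. This is not the thesis implementation: the Haar wavelet, the two decomposition levels, the zero-mean and unit-variance targets, and the function names `haar_dwt`, `haar_idwt`, `mvn`, and `subband_mvn` are all assumptions chosen for illustration, since the record does not state which wavelet, depth, or reference statistics were used.

```python
import math
from statistics import fmean, pstdev

SQRT2 = math.sqrt(2.0)


def haar_dwt(x):
    """One-level orthonormal Haar DWT of an even-length sequence."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def mvn(band, eps=1e-12):
    """Mean and variance normalization of one stream.

    Assumption: the targets are zero mean and unit variance.
    """
    mu = fmean(band)
    sigma = pstdev(band, mu)
    return [(v - mu) / (sigma + eps) for v in band]


def subband_mvn(seq, levels=2):
    """Apply MVN separately to each DWT sub-band of one feature trajectory.

    seq is the time sequence of a single feature dimension (for example one
    cepstral coefficient over frames). Its length must be divisible by
    2**levels in this simplified sketch.
    """
    if len(seq) % (2 ** levels) != 0:
        raise ValueError("sequence length must be divisible by 2**levels")

    # Analysis: repeatedly split the low band, giving octave-spaced
    # (non-uniform) sub-bands of the modulation spectrum.
    approx = list(seq)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(detail)

    # Normalize every sub-band stream independently.
    approx = mvn(approx)
    details = [mvn(d) for d in details]

    # Synthesis: rebuild the feature stream from the modified sub-bands.
    for detail in reversed(details):
        approx = haar_idwt(approx, detail)
    return approx


# Usage: normalize a toy 16-frame feature trajectory.
trajectory = [math.sin(0.7 * i) + 0.1 * i for i in range(16)]
normalized = subband_mvn(trajectory, levels=2)
```

Swapping `mvn` for a histogram-equalization function would give the sub-band HEQ variant under the same assumptions. Conventional full-band MVN corresponds to calling `mvn` on the whole trajectory without the decomposition.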


Similar Items