Design of Temporal Filters Based on Modulation Spectrum for Robust Speech Recognition
Master's thesis === National Chi Nan University === Department of Electrical Engineering === 94 ===
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Others |
Language: | zh-TW |
Published: | 2006 |
Online Access: | http://ndltd.ncl.edu.tw/handle/36126300610782932816 |
Summary: | Computers and related products have become a necessity in modern life, and many of them share common features: they are often small, lightweight, and even invisible. As a result, traditional man-machine interfaces such as the keyboard and mouse are no longer convenient. Voice, on the other hand, can be a very natural and efficient way for people to communicate with this new equipment, and with well-developed speech recognition techniques it is no longer a dream for us to "talk" with machines.
However, the performance of a speech recognizer is often limited by its application environment. For example, background noise and channel effects often degrade recognition accuracy very seriously. Researchers have proposed a great many approaches to enhance a recognizer's performance in adverse environments. In this thesis, we focus on developing new temporal filtering techniques for speech features in order to improve their robustness in noisy speech recognition.
The newly proposed temporal filters are based on statistical information about the modulation spectrum of speech features. They are derived from constrained versions of Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Maximum Class Distance (MCD), respectively.
The results of a series of experiments conducted on the Aurora 2.0 database show that the proposed temporal filters effectively enhance recognition performance in noisy environments, and that they can be integrated with other temporal filtering approaches, Cepstral Mean and Variance Normalization (CMVN) and Cepstral Gain Normalization (CGN), to provide further improvements.
|
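The summary names Cepstral Mean and Variance Normalization (CMVN) as one of the approaches the proposed filters can be combined with. As a rough illustration of that technique only, the following is a minimal sketch, assuming per-utterance statistics and features stored as a list of frames. The function name `cmvn`, the `eps` guard, and the toy data are illustrative assumptions and are not taken from the thesis, which may compute the normalization differently.

```python
import math


def cmvn(frames, eps=1e-10):
    """Per-utterance CMVN: give each coefficient zero mean and unit variance.

    `frames` is a list of feature vectors (one list of floats per frame).
    This layout is an assumption made for the sketch.
    """
    if not frames:
        return []
    n_frames = len(frames)
    dim = len(frames[0])

    # Mean of each coefficient over the time axis.
    means = [sum(f[d] for f in frames) / n_frames for d in range(dim)]

    # Population standard deviation of each coefficient over time.
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n_frames)
        for d in range(dim)
    ]

    # Subtract the mean and divide by the standard deviation;
    # eps avoids division by zero for constant coefficients.
    return [
        [(f[d] - means[d]) / max(stds[d], eps) for d in range(dim)]
        for f in frames
    ]


# Toy "utterance": three frames of two-dimensional features.
utterance = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = cmvn(utterance)
```

After normalization, each coefficient trajectory has the same first- and second-order statistics regardless of a constant offset or scaling in the input, which is why this kind of processing is commonly used against channel and noise mismatch.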