Summary: | Master's === National Chi Nan University === Department of Electrical Engineering === 95 === This thesis proposes a robust speech recognition technique for noisy environments. Feature extraction is based on Mel-frequency cepstral coefficients (MFCC), and template matching is performed with hidden Markov models (HMMs). Because temporal filters can improve speech recognition performance, we focus on optimizing these filters: a genetic algorithm (GA) dynamically selects suitable temporal filters in order to obtain robust MFCC features. The Mel-scale filter bank consists of 20 triangular filters, so 20 corresponding temporal filters are encoded into each chromosome. The genetic population contains 10 chromosomes. The experiments use spoken Chinese digits (0-9) from 20 speakers, each speaking 10 times. Half of the speakers provide the reference data and the other half provide the test data. The recognition rate reaches 44.5% at 0 dB SNR.
|
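The chromosome layout described in the abstract (one temporal filter per Mel band, 20 bands, 10 chromosomes) can be illustrated with a minimal sketch. Only the two sizes come from the abstract. Everything else is an assumption made for illustration: the candidate filter set `CANDIDATE_FILTERS`, the integer-index gene encoding, the truncation selection, one-point crossover, mutation rate, and the `fitness` callback are all hypothetical and are not taken from the thesis. In a real system the fitness would presumably be derived from HMM recognition performance on noisy speech.

```python
import random

N_BANDS = 20    # triangular Mel filter banks (from the abstract)
POP_SIZE = 10   # chromosomes in the population (from the abstract)

# Hypothetical candidate temporal filters (FIR taps); NOT from the thesis.
CANDIDATE_FILTERS = [
    [1.0],                      # identity (no temporal filtering)
    [1 / 3, 1 / 3, 1 / 3],      # 3-tap moving average
    [0.2, 0.2, 0.2, 0.2, 0.2],  # 5-tap moving average
    [0.5, 0.0, -0.5],           # simple causal difference filter
]


def apply_temporal_filter(trajectory, taps):
    """Causal FIR filtering of one band's trajectory over time (frames)."""
    out = []
    for t in range(len(trajectory)):
        acc = 0.0
        for k, h in enumerate(taps):
            if t - k >= 0:
                acc += h * trajectory[t - k]
        out.append(acc)
    return out


def filter_bands(band_trajectories, chromosome):
    """Filter each of the N_BANDS trajectories with the filter its gene selects."""
    return [
        apply_temporal_filter(traj, CANDIDATE_FILTERS[gene])
        for traj, gene in zip(band_trajectories, chromosome)
    ]


def random_chromosome(rng):
    """One gene per Mel band; each gene is an index into CANDIDATE_FILTERS."""
    return [rng.randrange(len(CANDIDATE_FILTERS)) for _ in range(N_BANDS)]


def evolve(fitness, generations=30, mutation_rate=0.05, seed=0):
    """Assumed GA loop: truncation selection, one-point crossover, mutation."""
    rng = random.Random(seed)
    population = [random_chromosome(rng) for _ in range(POP_SIZE)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: POP_SIZE // 2]
        children = [ranked[0][:]]  # elitism: carry the best chromosome over
        while len(children) < POP_SIZE:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, N_BANDS)  # one-point crossover position
            child = a[:cut] + b[cut:]
            child = [
                rng.randrange(len(CANDIDATE_FILTERS))
                if rng.random() < mutation_rate
                else gene
                for gene in child
            ]
            children.append(child)
        population = children
    return max(population, key=fitness)
```

As a usage illustration, `evolve(lambda c: -sum(abs(g - 1) for g in c))` runs the loop with a toy stand-in fitness that merely rewards picking filter index 1 in every band. It exercises the search mechanics only and says nothing about recognition accuracy.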