Summary: | Master's === National Chiao Tung University === Department of Communication Engineering === 92 === This thesis investigates the effect of feature transformation in speaker-adapted speech recognition. Two criteria, minimum mean-squared error and maximum likelihood, are used to formulate the feature transformation algorithm. The use of separate transformations for three broad speech classes (initial, final, and silence) is also studied. The effectiveness of the proposed method was examined by simulations on the MAT4500 telephone speech database, with 9/10 of the data used for training and 1/10 for testing. Sentential utterances were used in the speaker adaptation test. The amount of adaptation data ranged from one utterance (4 seconds) to eight utterances (37 seconds). Experimental results showed that the proposed feature transformation method can eliminate the speaker/channel effect, making the HMM models more compact. We also found that, as more transformation parameters were used, the upper bound of the recognition rate improved, while the adaptation effect became worse for small amounts of adaptation data. This mainly resulted from inaccurate parameter estimation when insufficient adaptation data were available.
|