Summary: | Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 91 === In this thesis, we develop a speaker adaptation method. The method needs only a small number of training utterances because the adaptation operates at the level of MFCC feature parameters. First, an individual coordinate system is built for each new speaker so that the speaker's MFCC feature vectors can be decomposed into coordinate coefficients of that system. These coefficients are then mapped directly to coefficients of the coordinate system of a target speaker. Although the mechanism is simple, it achieves good adaptation performance. To verify this, we ran recognition experiments on four kinds of vocabulary: single-vowel, multi-vowel, nasal-containing syllable, and disyllabic word vocabularies. Without speaker adaptation, the recognition error rates are 30.3%, 20.7%, 38.3% and 21.3%, respectively. With speaker adaptation, the error rates are reduced to 3.3%, 9.8%, 22.5% and 12.3%, respectively.
|
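The decompose-then-map idea in the summary can be illustrated with a minimal sketch. The summary does not say how each speaker's coordinate system is constructed, so the sketch rests on an assumption that is not taken from the thesis: each system is an origin plus basis vectors derived from the mean MFCC vectors of a few reference sounds spoken by both the new speaker and the target speaker. The function names (`build_coordinate_system`, `decompose`, `adapt`) and the toy numbers are hypothetical.

```python
import numpy as np

def build_coordinate_system(anchors):
    """Build a coordinate system from anchor vectors (assumed construction).

    anchors: array of shape (k + 1, d), e.g. mean MFCC vectors of k + 1
    reference sounds from one speaker. The first anchor is the origin and
    the remaining anchors, relative to it, are the k basis vectors.
    """
    anchors = np.asarray(anchors, dtype=float)
    origin = anchors[0]
    basis = anchors[1:] - origin          # shape (k, d)
    return origin, basis

def decompose(x, origin, basis):
    """Least-squares coefficients c such that origin + c @ basis approximates x."""
    coeffs, *_ = np.linalg.lstsq(basis.T, np.asarray(x, dtype=float) - origin,
                                 rcond=None)
    return coeffs

def adapt(x, source_system, target_system):
    """Map a feature vector of the new speaker into the target speaker's space."""
    coeffs = decompose(x, *source_system)   # coefficients in the source system
    target_origin, target_basis = target_system
    # Reuse the same coefficients in the target speaker's coordinate system.
    return target_origin + coeffs @ target_basis

# Toy 3-dimensional vectors standing in for MFCC vectors (made-up numbers).
source_anchors = np.array([[0.0, 0.0, 0.0],
                           [1.0, 0.0, 0.0],
                           [0.0, 1.0, 0.0],
                           [0.0, 0.0, 1.0]])
target_anchors = np.array([[1.0, -1.0, 0.5],
                           [3.0, -1.0, 1.5],
                           [1.0, 2.0, 0.5],
                           [1.0, -1.0, 1.5]])

source_system = build_coordinate_system(source_anchors)
target_system = build_coordinate_system(target_anchors)
adapted = adapt(np.array([0.2, -0.4, 1.5]), source_system, target_system)
```

In this sketch, any component of a feature vector that lies outside the span of the source basis is discarded by the least-squares step, so the number and choice of reference sounds would matter in practice.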