Summary: | Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 95 === This thesis studies how to build a large-vocabulary Hakka spoken-word recognition system. Some of the system modules are taken directly from the HMM modules in HTK. When the HTK grammar tool is used for large-vocabulary word recognition, recognition is quite slow. We therefore propose and study a two-stage recognition method. With this method, recognition speed improves substantially and approaches real time, although the recognition rate decreases slightly. In addition, we determined the best values for the model parameters through experiments with the Hakka word recognition system. Specifically, right-context-dependent initial and final units are the best segmentation units for HMM acoustic modeling. The best number of states is 4, the best number of mixtures is 11, and the best number of syllable candidates for word construction in the second recognition stage is 13. With these parameter values, the highest recognition rate obtained is 90.0%, and recognizing a word utterance takes 0.95 seconds on average.
|
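The abstract does not spell out how the second recognition stage builds words from syllable candidates, so the following is only an illustrative sketch of one plausible reading, not the thesis's actual implementation. It assumes that the first stage yields, for each syllable position, a best-first list of (syllable, log score) pairs, and that the second stage keeps the top N candidates per position (N = 13 in the reported best setting) and ranks the lexicon words whose syllables all survive that pruning. The function name `construct_words`, the data layout, and the additive scoring are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical stage-1 output: one best-first list of (syllable, log_score)
# pairs per syllable position in the utterance.
Candidates = List[List[Tuple[str, float]]]


def construct_words(
    candidates: Candidates,
    lexicon: Dict[str, List[str]],
    top_n: int = 13,
) -> List[Tuple[str, float]]:
    """Rank lexicon words that can be built from the top_n syllable candidates.

    Assumption (not from the thesis): a word's score is the sum of the
    log scores of its syllables at their respective positions.
    """
    # Keep only the top_n candidates at each position, as a lookup table.
    pruned = [dict(position[:top_n]) for position in candidates]

    scored: List[Tuple[str, float]] = []
    for word, syllables in lexicon.items():
        # A word must have exactly one syllable per recognized position.
        if len(syllables) != len(pruned):
            continue
        # Every syllable must survive pruning at its own position.
        if all(s in pruned[i] for i, s in enumerate(syllables)):
            total = sum(pruned[i][s] for i, s in enumerate(syllables))
            scored.append((word, total))

    # Highest total log score first.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored


# Placeholder syllable labels and word IDs, purely for illustration.
example_candidates: Candidates = [
    [("s1", -1.0), ("s2", -2.0)],
    [("s3", -0.5), ("s4", -3.0)],
]
example_lexicon = {
    "w1": ["s1", "s3"],
    "w2": ["s2", "s4"],
    "w3": ["s1", "s9"],  # "s9" is not a candidate, so w3 is rejected
    "w4": ["s1"],        # wrong syllable count, so w4 is rejected
}
ranked = construct_words(example_candidates, example_lexicon)
```

A smaller `top_n` shrinks the search space and speeds up the second stage but risks pruning away the correct syllable, which is consistent with the speed versus recognition-rate trade-off the abstract reports.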