Summary: | Master's === National Tsing Hua University === Department of Computer Science === 91 === Mobile phones and personal digital devices keep getting smaller while offering more functions, which makes the traditional typing-based input scheme inconvenient to use. Voice input is a promising solution.
In this thesis, we develop a speaker-independent Mandarin digit speech recognition system based on discrete hidden Markov models (DHMMs), aiming for simple models, low computation, and a high recognition rate. We improve accuracy with several techniques covering feature extraction, feature vector quantization, classifier combination, and corrective training. In feature extraction, we use nonuniform frame shifting (NUFS) to increase the weight of the beginning part of the utterance. In feature vector quantization, we use a separate codebook for each digit. The distance between a feature vector and the codebook centers can itself serve as a very good classifier for digit recognition. Furthermore, combining this distance with the log probability of the DHMM can increase the accuracy further.
We also apply corrective training, which corrects models that cause a misclassification or whose log probability is close to that of the target model. Additionally, we use a segmental probability model (SPM) to reduce computation time.
|
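The classifier combination described in the summary can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes that the per-digit DHMM log probabilities are already available (the HMM scoring itself is not shown), that the distance is the average squared Euclidean distance to the nearest codeword, and that the two scores are combined linearly. The function names and the `weight` parameter are hypothetical.

```python
def min_sq_dist(vec, codebook):
    """Squared Euclidean distance from vec to its nearest codeword."""
    return min(sum((a - b) ** 2 for a, b in zip(vec, cw)) for cw in codebook)


def quantization_distortion(frames, codebook):
    """Average nearest-codeword distance over all frames of one utterance."""
    return sum(min_sq_dist(f, codebook) for f in frames) / len(frames)


def classify(frames, codebooks, log_probs, weight=1.0):
    """Pick the digit with the best combined score.

    codebooks: digit -> list of codewords (one separate codebook per digit).
    log_probs: digit -> DHMM log probability of the utterance (assumed given).
    weight:    hypothetical trade-off between the two scores; the linear
               combination is an assumption, not taken from the thesis.
    """
    scores = {
        digit: log_probs[digit] - weight * quantization_distortion(frames, cb)
        for digit, cb in codebooks.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy data: two 2-D codebooks and a two-frame utterance close to digit "0".
codebooks = {
    "0": [(0.0, 0.0), (1.0, 0.0)],
    "1": [(5.0, 5.0), (6.0, 5.0)],
}
frames = [(0.1, 0.0), (0.9, 0.1)]
log_probs = {"0": -10.0, "1": -9.5}  # made-up DHMM scores for illustration

best_digit, scores = classify(frames, codebooks, log_probs)
```

In this toy setup the DHMM scores alone slightly favor digit "1", while the codebook distance strongly favors digit "0", so the combined score selects "0". Setting `weight=0.0` falls back to the log probability alone.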