System design of a WFST-based Mandarin speech recognizer

Bibliographic Details
Main Authors: Su, Chung-Ming, 蘇仲銘
Other Authors: Wang, Yih-Ru
Format: Others
Language: zh-TW
Published: 2013
Online Access: http://ndltd.ncl.edu.tw/handle/03540751552342577320
Description
Summary: Master's thesis === National Chiao Tung University === Institute of Communications Engineering === 102 === This thesis focuses on improving the language model in automatic speech recognition (ASR). The training data are normalized by merging synonyms, variant word forms, and words with multiple pronunciations. Dictionary words are selected by word class: the selection threshold is raised for open word classes and lowered for closed word classes, and the distribution of each word in the training data is also taken into account. After the language model is trained, its quality is estimated by decoding from syllables. We find that a WFST-based recognizer is 20 times faster than a traditional recognition system at the same recognition rate, so the thesis mainly studies how to use weighted finite-state transducers (WFSTs) to build a large-vocabulary continuous Mandarin speech recognizer. We first introduce the WFST algorithms, represent each layer of the ASR system as a WFST, and apply optimization to minimize the resulting WFST. We then modify the features used when building the language model. Finally, we vary the size of the WFST and the features used during recognition to find the relationship between recognition rate and recognition time.
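
Illustrative note (not taken from the thesis): the summary mentions representing each ASR layer as a WFST. Such layers are usually combined by composition, which can be sketched as below. This is a minimal sketch assuming epsilon-free transducers and tropical-semiring weights (costs add along a path). The dictionary layout, the function name `compose`, and the toy lexicon `L` and grammar `G` with word IDs `NI` and `HAO` are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def compose(a, b):
    """Compose two epsilon-free WFSTs; weights are costs that add (tropical semiring).

    Assumed layout for each transducer (hypothetical, for illustration only):
    {"start": state, "finals": {state: cost},
     "arcs": {state: [(in_label, out_label, cost, next_state), ...]}}
    """
    start = (a["start"], b["start"])
    arcs = defaultdict(list)
    finals = {}
    stack, seen = [start], {start}
    while stack:
        qa, qb = stack.pop()
        # A pair state is final only if both component states are final.
        if qa in a["finals"] and qb in b["finals"]:
            finals[(qa, qb)] = a["finals"][qa] + b["finals"][qb]
        for in_label, mid, w1, na in a["arcs"].get(qa, []):
            for mid2, out_label, w2, nb in b["arcs"].get(qb, []):
                # An arc survives when a's output label matches b's input label.
                if mid == mid2:
                    nxt = (na, nb)
                    arcs[(qa, qb)].append((in_label, out_label, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return {"start": start, "finals": finals, "arcs": dict(arcs)}

# Toy lexicon L: maps a syllable to a (hypothetical) one-syllable word ID at no cost.
L = {"start": 0, "finals": {0: 0.0},
     "arcs": {0: [("ni", "NI", 0.0, 0), ("hao", "HAO", 0.0, 0)]}}

# Toy grammar G: accepts the word sequence NI HAO with made-up language model costs.
G = {"start": 0, "finals": {2: 0.0},
     "arcs": {0: [("NI", "NI", 1.0, 1)], 1: [("HAO", "HAO", 2.0, 2)]}}

LG = compose(L, G)  # maps syllable sequences to weighted word sequences
```

In this toy case `LG` accepts the syllable sequence `ni hao`, outputs `NI HAO`, and carries the grammar costs on its arcs. Practical toolkits additionally handle epsilon labels and apply determinization and minimization, which this sketch omits.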