Summary: | Master's === National Taiwan Normal University === Graduate Institute of Computer Science and Information Engineering === 98 === This thesis investigates the consistency properties underlying the most popular algorithms for discriminative training of acoustic models. It also explores a range of margin- and boosting-based training data selection methods, used in conjunction with these discriminative training algorithms, for Mandarin large vocabulary continuous speech recognition (LVCSR). First, to evaluate in depth the utility of recently developed discriminative acoustic model training algorithms, we attempt to derive their consistency properties from their individual training objectives. Second, we compare margin- and boosting-based methods that focus acoustic model training on the more discriminative training data in order to improve LVCSR performance. We further attempt to combine the soft-margin-based and boosting-based methods to exploit more discriminative statistics, implementing the combination through utterance-level data selection. All experiments are conducted on a Mandarin broadcast news corpus compiled in Taiwan, and the results suggest that the proposed approaches can relieve the over-training problem to a certain extent.
|