An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition

Master's === National Taiwan Normal University === Graduate Institute of Computer Science and Information Engineering === 93 === Discriminative training of acoustic models has been an active research focus in automatic speech recognition (ASR) in recent years. This thesis extensively investigated the use of Minimum Phone Error (MPE) approaches for discriminative...

Bibliographic Details
Main Authors:	Jen-Wei Kuo, 郭人瑋
Other Authors:	Berlin Chen
Format:	Others
Language:	zh-TW
Published:	2005
Online Access:	http://ndltd.ncl.edu.tw/handle/19056500834940813930

id	ndltd-TW-093NTNU5392009
record_format	oai_dc
spelling	ndltd-TW-093NTNU5392009 2016-06-03T04:13:43Z http://ndltd.ncl.edu.tw/handle/19056500834940813930 An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition 最小化音素錯誤鑑別式聲學模型學習於中文大詞彙連續語音辨識之初步研究 Jen-Wei Kuo 郭人瑋 Master's, National Taiwan Normal University, Graduate Institute of Computer Science and Information Engineering, 93 Berlin Chen 陳柏琳 2005 Thesis 154 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === National Taiwan Normal University === Graduate Institute of Computer Science and Information Engineering === 93 === Discriminative training of acoustic models has been an active research focus in automatic speech recognition (ASR) in recent years. This thesis extensively investigated the use of Minimum Phone Error (MPE) approaches for discriminative training and adaptation of acoustic models for Mandarin large vocabulary continuous speech recognition (LVCSR). All experiments were carried out on the Mandarin broadcast news corpus (MATBN). The experimental results show that MPE training gives significant improvements over baseline systems whose acoustic models were trained under the Maximum Likelihood (ML) or Maximum Mutual Information (MMI) principles. Compared with the ML-trained acoustic models, the MPE-trained models achieved relative reductions of 15.52% in syllable error rate (SER), 12.33% in character error rate (CER), and 10.02% in word error rate (WER). Unsupervised adaptation of acoustic models via MPE-trained linear transformations in either the model space or the feature space was also studied, with promising results. However, because no correct reference transcript is available for accuracy calculation and only the top-one automatic transcript can be used instead, the unsupervised MPE-based adaptation techniques may not always accumulate good estimates for the acoustic model parameters, and their performance is thus substantially degraded. To tackle this problem, this thesis proposed a novel Raw Accuracy Prediction Model (RAPM) to improve the MPE-based adaptation techniques; initial experiments showed slight performance gains.
author2	Berlin Chen
author_facet	Berlin Chen Jen-Wei Kuo 郭人瑋
author	Jen-Wei Kuo 郭人瑋
spellingShingle	Jen-Wei Kuo 郭人瑋 An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition
author_sort	Jen-Wei Kuo
title	An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition
title_short	An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition
title_full	An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition
title_fullStr	An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition
title_full_unstemmed	An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition
title_sort	initial study on minimum phone error discriminative learning of acoustic models for mandarin large vocabulary continuous speech recognition
publishDate	2005
url	http://ndltd.ncl.edu.tw/handle/19056500834940813930
work_keys_str_mv	AT jenweikuo aninitialstudyonminimumphoneerrordiscriminativelearningofacousticmodelsformandarinlargevocabularycontinuousspeechrecognition AT guōrénwěi aninitialstudyonminimumphoneerrordiscriminativelearningofacousticmodelsformandarinlargevocabularycontinuousspeechrecognition AT jenweikuo zuìxiǎohuàyīnsùcuòwùjiànbiéshìshēngxuémóxíngxuéxíyúzhōngwéndàcíhuìliánxùyǔyīnbiànshízhīchūbùyánjiū AT guōrénwěi zuìxiǎohuàyīnsùcuòwùjiànbiéshìshēngxuémóxíngxuéxíyúzhōngwéndàcíhuìliánxùyǔyīnbiànshízhīchūbùyánjiū AT jenweikuo initialstudyonminimumphoneerrordiscriminativelearningofacousticmodelsformandarinlargevocabularycontinuousspeechrecognition AT guōrénwěi initialstudyonminimumphoneerrordiscriminativelearningofacousticmodelsformandarinlargevocabularycontinuousspeechrecognition
_version_	1718292996131127296

Similar Items