An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition

Master's thesis === National Taiwan Normal University === Graduate Institute of Computer Science and Information Engineering === 93 === Discriminative training of acoustic models has been an active research focus in automatic speech recognition (ASR) over the past few years. This thesis extensively investigated the use of Minimum Phone Error (MPE) approaches for discriminative...

Bibliographic Details
Main Authors: Jen-Wei Kuo, 郭人瑋
Other Authors: Berlin Chen
Format: Others
Language: zh-TW
Published: 2005
Online Access: http://ndltd.ncl.edu.tw/handle/19056500834940813930
id ndltd-TW-093NTNU5392009
record_format oai_dc
spelling ndltd-TW-093NTNU5392009 2016-06-03T04:13:43Z http://ndltd.ncl.edu.tw/handle/19056500834940813930 An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition 最小化音素錯誤鑑別式聲學模型學習於中文大詞彙連續語音辨識之初步研究 Jen-Wei Kuo 郭人瑋 Master's thesis National Taiwan Normal University Graduate Institute of Computer Science and Information Engineering 93 Berlin Chen 陳柏琳 2005 thesis 154 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's thesis === National Taiwan Normal University === Graduate Institute of Computer Science and Information Engineering === 93 === Discriminative training of acoustic models has been an active research focus in automatic speech recognition (ASR) over the past few years. This thesis extensively investigated the use of Minimum Phone Error (MPE) approaches for discriminative training and adaptation of acoustic models for Mandarin large vocabulary continuous speech recognition (LVCSR). All experiments were carried out on the Mandarin broadcast news corpus (MATBN). The experimental results show that MPE training gives significant improvements over baseline systems whose acoustic models were trained under the Maximum Likelihood (ML) or Maximum Mutual Information (MMI) principles. Compared with the ML-trained acoustic models, the MPE-trained models yielded relative reductions of 15.52% in syllable error rate (SER), 12.33% in character error rate (CER), and 10.02% in word error rate (WER). Unsupervised adaptation of acoustic models via MPE-trained linear transformations in either the model space or the feature space was also studied, with promising results. However, because no correct reference transcript is available for accuracy calculation and only the top-one automatic transcript can be used instead, the unsupervised MPE-based adaptation techniques may not always accumulate good estimates for the acoustic model parameters, and their performance can degrade substantially. To tackle this problem, the thesis proposed a novel Raw Accuracy Prediction Model (RAPM) to improve the MPE-based adaptation techniques, and slight performance gains were demonstrated in initial experiments.
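The abstract reports its gains as relative error-rate reductions. As a minimal sketch of how such a figure is conventionally computed, the snippet below uses a hypothetical helper, `relative_reduction`, and made-up rates; neither the function name nor the numbers come from the thesis, and the thesis may define its metrics differently.

```python
def relative_reduction(baseline_rate: float, new_rate: float) -> float:
    """Return the relative error-rate reduction, in percent.

    Both arguments are absolute error rates in the same unit (e.g. percent).
    """
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return 100.0 * (baseline_rate - new_rate) / baseline_rate


# Hypothetical rates chosen only for illustration; they are NOT results
# reported in the thesis.
baseline_cer = 20.0  # assumed ML-trained baseline character error rate (%)
mpe_cer = 17.0       # assumed MPE-trained character error rate (%)

print(f"{relative_reduction(baseline_cer, mpe_cer):.2f}% relative reduction")
```

A relative reduction is measured against the baseline rate, so the same absolute drop counts for more when the baseline error rate is lower.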
author2 Berlin Chen
author_facet Berlin Chen
Jen-Wei Kuo
郭人瑋
author Jen-Wei Kuo
郭人瑋
title An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition
publishDate 2005
url http://ndltd.ncl.edu.tw/handle/19056500834940813930