An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition
Master's === National Taiwan Normal University === Graduate Institute of Computer Science and Information Engineering === 93 === Discriminative training of acoustic models has been an active focus of research in automatic speech recognition (ASR) over the past few years. This thesis extensively investigated the use of Minimum Phone Error (MPE) approaches for discriminative...
Main Authors: | Jen-Wei Kuo, 郭人瑋 |
---|---|
Other Authors: | Berlin Chen |
Format: | Others |
Language: | zh-TW |
Published: | 2005 |
Online Access: | http://ndltd.ncl.edu.tw/handle/19056500834940813930 |
id |
ndltd-TW-093NTNU5392009 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-093NTNU5392009 2016-06-03T04:13:43Z http://ndltd.ncl.edu.tw/handle/19056500834940813930 An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition 最小化音素錯誤鑑別式聲學模型學習於中文大詞彙連續語音辨識之初步研究 Jen-Wei Kuo 郭人瑋 Master's; National Taiwan Normal University; Graduate Institute of Computer Science and Information Engineering; 93 Berlin Chen 陳柏琳 2005 thesis 154 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan Normal University === Graduate Institute of Computer Science and Information Engineering === 93 === Discriminative training of acoustic models has been an active focus of research in automatic speech recognition (ASR) over the past few years. This thesis extensively investigates the use of Minimum Phone Error (MPE) approaches for discriminative training and adaptation of acoustic models for Mandarin large vocabulary continuous speech recognition (LVCSR). All experiments were carried out on the Mandarin broadcast news corpus (MATBN). The experimental results show that MPE training gives significant improvements over baseline systems whose acoustic models were trained under the Maximum Likelihood (ML) and Maximum Mutual Information (MMI) principles. Compared with the ML-trained acoustic models, the MPE-trained models achieved relative reductions of 15.52% in syllable error rate (SER), 12.33% in character error rate (CER), and 10.02% in word error rate (WER). Unsupervised adaptation of acoustic models via MPE-trained linear transformations in either the model space or the feature space was also studied, with promising results. However, because no correct reference transcript is available for accuracy calculation in the unsupervised setting, and only the top-one automatic transcript can be used instead, the unsupervised MPE-based adaptation techniques may not always accumulate good estimates of the acoustic model parameters, and their performance can degrade substantially. To tackle this problem, this thesis proposes a novel Raw Accuracy Prediction Model (RAPM) to improve the MPE-based adaptation techniques; slight performance gains were demonstrated in initial experiments. |
author2 |
Berlin Chen |
author_facet |
Berlin Chen Jen-Wei Kuo 郭人瑋 |
author |
Jen-Wei Kuo 郭人瑋 |
spellingShingle |
Jen-Wei Kuo 郭人瑋 An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition |
author_sort |
Jen-Wei Kuo |
title |
An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition |
title_short |
An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition |
title_full |
An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition |
title_fullStr |
An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition |
title_full_unstemmed |
An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition |
title_sort |
initial study on minimum phone error discriminative learning of acoustic models for mandarin large vocabulary continuous speech recognition |
publishDate |
2005 |
url |
http://ndltd.ncl.edu.tw/handle/19056500834940813930 |
work_keys_str_mv |
AT jenweikuo aninitialstudyonminimumphoneerrordiscriminativelearningofacousticmodelsformandarinlargevocabularycontinuousspeechrecognition AT guōrénwěi aninitialstudyonminimumphoneerrordiscriminativelearningofacousticmodelsformandarinlargevocabularycontinuousspeechrecognition AT jenweikuo zuìxiǎohuàyīnsùcuòwùjiànbiéshìshēngxuémóxíngxuéxíyúzhōngwéndàcíhuìliánxùyǔyīnbiànshízhīchūbùyánjiū AT guōrénwěi zuìxiǎohuàyīnsùcuòwùjiànbiéshìshēngxuémóxíngxuéxíyúzhōngwéndàcíhuìliánxùyǔyīnbiànshízhīchūbùyánjiū AT jenweikuo initialstudyonminimumphoneerrordiscriminativelearningofacousticmodelsformandarinlargevocabularycontinuousspeechrecognition AT guōrénwěi initialstudyonminimumphoneerrordiscriminativelearningofacousticmodelsformandarinlargevocabularycontinuousspeechrecognition |
_version_ |
1718292996131127296 |