Speaker Adaptation for Mandarin Syllable Recognition Based on Semi-Continuous Density HMM

Master's === National Chiao Tung University === Institute of Computer Science and Information Engineering === 83 === A speaker-dependent speech recognition system achieves a high recognition rate, but it needs a lot of speaker-specific training data. A speaker-independent (or multi-speaker) system needs no tr...

Bibliographic Details
Main Authors:	Chien-Hung Chen, 陳健宏
Other Authors:	Chi-Min Liu
Format:	Others
Language:	zh-TW
Published:	1995
Online Access:	http://ndltd.ncl.edu.tw/handle/66731320553819607466

id	ndltd-TW-083NCTU0392064
record_format	oai_dc
spelling	ndltd-TW-083NCTU0392064 2015-10-13T12:53:37Z http://ndltd.ncl.edu.tw/handle/66731320553819607466 Speaker Adaptation for Mandarin Syllable Recognition Based on Semi-Continuous Density HMM 利用半連續型隱藏式馬可夫模型建立的語者調適之中文語音辨識系統 Chien-Hung Chen 陳健宏 Master's National Chiao Tung University Institute of Computer Science and Information Engineering 83 A speaker-dependent speech recognition system achieves a high recognition rate, but it needs a lot of speaker-specific training data. A speaker-independent (or multi-speaker) system needs no training data from new speakers, but it usually cannot reach satisfactory performance. A speaker-adaptive system uses the existing knowledge of a reliably trained reference system, so that a small amount of training data from a new speaker is sufficient to reach the performance of a speaker-dependent system. In this thesis, we consider the application of speaker adaptation techniques to Mandarin speech. The vocabulary we study has 76 syllables, which include 19 INITIALs and 4 FINALs from the confusable sets of Mandarin syllables. As reference systems for speaker adaptation, we create speaker-dependent and speaker-independent systems based on the semi-continuous density hidden Markov model (SCHMM). The speaker-dependent system has an average recognition rate of 90.46% and the speaker-independent system 58.97%. On the basis of the two reference systems, we study Bayesian adaptation techniques with the forward-backward training procedure. We apply the adaptation techniques to adjust the codebooks, mixture weights, and transition probabilities in the SCHMM. Experimental results show that, with only one training token, the adaptation procedure outperforms the speaker-independent system, raising the recognition rate from 58.97% to 76.65%. When 3 training tokens are used, the recognition rate approaches that of the speaker-dependent system. When 6 training tokens are used, the recognition rate exceeds that of the speaker-dependent system. Chi-Min Liu 劉啟民 1995 學位論文 ; thesis 70 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === National Chiao Tung University === Institute of Computer Science and Information Engineering === 83 === A speaker-dependent speech recognition system achieves a high recognition rate, but it needs a lot of speaker-specific training data. A speaker-independent (or multi-speaker) system needs no training data from new speakers, but it usually cannot reach satisfactory performance. A speaker-adaptive system uses the existing knowledge of a reliably trained reference system, so that a small amount of training data from a new speaker is sufficient to reach the performance of a speaker-dependent system. In this thesis, we consider the application of speaker adaptation techniques to Mandarin speech. The vocabulary we study has 76 syllables, which include 19 INITIALs and 4 FINALs from the confusable sets of Mandarin syllables. As reference systems for speaker adaptation, we create speaker-dependent and speaker-independent systems based on the semi-continuous density hidden Markov model (SCHMM). The speaker-dependent system has an average recognition rate of 90.46% and the speaker-independent system 58.97%. On the basis of the two reference systems, we study Bayesian adaptation techniques with the forward-backward training procedure. We apply the adaptation techniques to adjust the codebooks, mixture weights, and transition probabilities in the SCHMM. Experimental results show that, with only one training token, the adaptation procedure outperforms the speaker-independent system, raising the recognition rate from 58.97% to 76.65%. When 3 training tokens are used, the recognition rate approaches that of the speaker-dependent system. When 6 training tokens are used, the recognition rate exceeds that of the speaker-dependent system.
author2	Chi-Min Liu
author_facet	Chi-Min Liu Chien-Hung Chen 陳健宏
author	Chien-Hung Chen 陳健宏
spellingShingle	Chien-Hung Chen 陳健宏 Speaker Adaptation for Mandarin Syllable Recognition Based on Semi-Continuous Density HMM
author_sort	Chien-Hung Chen
title	Speaker Adaptation for Mandarin Syllable Recognition Based on Semi-Continuous Density HMM
title_short	Speaker Adaptation for Mandarin Syllable Recognition Based on Semi-Continuous Density HMM
title_full	Speaker Adaptation for Mandarin Syllable Recognition Based on Semi-Continuous Density HMM
title_fullStr	Speaker Adaptation for Mandarin Syllable Recognition Based on Semi-Continuous Density HMM
title_full_unstemmed	Speaker Adaptation for Mandarin Syllable Recognition Based on Semi-Continuous Density HMM
title_sort	speaker adaptation for mandarin syllable recognition based on semi-continuous density hmm
publishDate	1995
url	http://ndltd.ncl.edu.tw/handle/66731320553819607466
work_keys_str_mv	AT chienhungchen speakeradaptationformandarinsyllablerecognitionbasedonsemicontinuousdensityhmm AT chénjiànhóng speakeradaptationformandarinsyllablerecognitionbasedonsemicontinuousdensityhmm AT chienhungchen lìyòngbànliánxùxíngyǐncángshìmǎkěfūmóxíngjiànlìdeyǔzhědiàoshìzhīzhōngwényǔyīnbiànshíxìtǒng AT chénjiànhóng lìyòngbànliánxùxíngyǐncángshìmǎkěfūmóxíngjiànlìdeyǔzhědiàoshìzhīzhōngwényǔyīnbiànshíxìtǒng
_version_	1716868648158625792
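
The abstract mentions Bayesian adaptation of SCHMM mixture weights and codebooks using statistics from forward-backward training, but this record does not give the thesis's formulas. The following is only a minimal sketch of one common form of Bayesian (MAP) interpolation, not the author's method: the reference system's parameters act as a prior, and the occupancy statistics from the new speaker's tokens pull the estimate away from that prior. The function names, the prior strength `tau`, and the example numbers are all assumptions made for illustration.

```python
def map_adapt_weights(prior_weights, counts, tau=10.0):
    """Interpolate reference mixture weights with new-speaker occupancy counts.

    prior_weights: mixture weights of one HMM state in the reference system.
    counts: expected codeword occupancies for that state, as would be
            accumulated by forward-backward on the adaptation tokens.
    tau: hypothetical prior strength; larger values trust the reference more.
    """
    total = tau + sum(counts)
    return [(tau * w0 + c) / total for w0, c in zip(prior_weights, counts)]


def map_adapt_mean(prior_mean, weighted_sum, occupancy, tau=10.0):
    """Interpolate one codebook mean vector with new-speaker statistics.

    prior_mean: mean vector of one codeword in the reference codebook.
    weighted_sum: sum over frames of (occupancy probability * feature vector).
    occupancy: sum over frames of the occupancy probability.
    """
    return [(tau * m0 + s) / (tau + occupancy)
            for m0, s in zip(prior_mean, weighted_sum)]


if __name__ == "__main__":
    # With no adaptation data the reference parameters are returned unchanged.
    print(map_adapt_weights([0.5, 0.5], [0.0, 0.0]))
    # With some data the estimate moves toward the new speaker's statistics.
    print(map_adapt_weights([0.5, 0.5], [10.0, 0.0]))
    print(map_adapt_mean([0.0, 2.0], [10.0, 0.0], occupancy=10.0))
```

In this form, a single training token yields small counts and the estimate stays close to the reference system, while more tokens move it toward a speaker-specific estimate. That matches the qualitative trend the abstract reports, though the actual update rules used in the thesis may differ.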

Similar Items