A Technique for Speaker Independent Automatic Speech Recognition Based on Decision Tree State Tying with GCVHMM
Chinese Title: 以結合決策樹與GCVHMM為基礎之不特定語者中文連續數字語音辨識
Main Author: Yu-Lin Chou (周佑霖)
Other Authors (Advisors): Chin-Teng Lin (林進燈), Chi-Cheng Jou (周志成)
Degree: Master's (碩士), National Chiao Tung University (國立交通大學), Department of Electrical and Control Engineering (電機與控制工程系), academic year 90
Format: Others (thesis, 70 pages)
Language: en_US
Published: 2002
Online Access: http://ndltd.ncl.edu.tw/handle/42812214292865347038
Record ID: ndltd-TW-090NCTU0591078
Description:
This thesis proposes a new speech recognition technique for speaker-independent recognition of continuous spoken Mandarin digits. One popular tool for this problem is the HMM-based one-state algorithm, a connected-word pattern-matching method. However, two problems with this conventional method prevent its practical use on our target problem. The first is the lack of a proper mechanism for selecting acoustic models that are robust for speaker-independent recognition. The second is that the acoustic model does not capture the intersyllable co-articulatory effect.
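(Editorial note: for readers unfamiliar with the one-state, or one-pass, connected-word algorithm mentioned above, the following is a minimal illustrative sketch, not the thesis's implementation. It assumes precomputed per-frame log observation likelihoods for each word's left-to-right HMM states, omits transition probabilities, and uses hypothetical names such as `one_pass_viterbi` and `loop_penalty`.)

```python
import numpy as np

NEG_INF = -np.inf

def one_pass_viterbi(obs_loglik, loop_penalty=0.0):
    """Sketch of one-pass connected-word decoding: a single Viterbi search over
    a loop of left-to-right word HMMs, where the final state of any word may
    jump to the first state of any word between frames.

    obs_loglik : list of (T, S_w) arrays, one per vocabulary word, holding the
                 per-frame log observation likelihoods of that word's states.
    Returns the best total log score and the recognized word sequence.
    """
    T = obs_loglik[0].shape[0]
    # score[w][s]: best log score of a path ending in state s of word w.
    score = [np.full(m.shape[1], NEG_INF) for m in obs_loglik]
    # hist[w][s]: words completed before the currently active word w.
    hist = [[[] for _ in range(m.shape[1])] for m in obs_loglik]

    for t in range(T):
        # Best word-final score from the previous frame (for word-to-word jumps).
        if t == 0:
            entry, entry_hist = 0.0, []
        else:
            ends = [(score[w][-1], hist[w][-1] + [w]) for w in range(len(obs_loglik))]
            entry, entry_hist = max(ends, key=lambda e: e[0])
            entry = entry + loop_penalty

        new_score, new_hist = [], []
        for w, m in enumerate(obs_loglik):
            S = m.shape[1]
            ns = np.full(S, NEG_INF)
            nh = [[] for _ in range(S)]
            for s in range(S):
                # Candidates: stay in state s; advance from s-1; or (s == 0) enter the word.
                cands = [(score[w][s], hist[w][s])]
                if s > 0:
                    cands.append((score[w][s - 1], hist[w][s - 1]))
                else:
                    cands.append((entry, entry_hist))
                best, best_h = max(cands, key=lambda c: c[0])
                ns[s] = best + m[t, s]
                nh[s] = best_h
            new_score.append(ns)
            new_hist.append(nh)
        score, hist = new_score, new_hist

    # Read off the best path that ends in some word's final state.
    ends = [(score[w][-1], hist[w][-1] + [w]) for w in range(len(obs_loglik))]
    best, words = max(ends, key=lambda e: e[0])
    return best, words
```

A word-insertion penalty or grammar constraint would normally be applied at the word-loop transition (here the illustrative `loop_penalty`); the experiments reported in this abstract use no grammar or lexical information.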
First, a generalized common-vector (GCV) approach is developed, based on eigenanalysis of the covariance matrix, to extract a feature that is invariant across different speakers as well as to acoustic environment effects and phase or temporal differences. The GCV scheme is then integrated into the conventional HMM to form a new GCV-based HMM, called the GCVHMM, which is well suited to speaker-independent recognition.
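(Editorial note: as an illustration of the common-vector idea behind GCV, not the thesis's exact formulation, the sketch below performs an eigenanalysis of the within-class covariance matrix and projects features onto the low-variance eigen-directions, i.e. the subspace least affected by speaker and environment variation. The function name `common_vector_projection`, the 39-dimensional feature size, and the `keep_dim` value are illustrative assumptions.)

```python
import numpy as np

def common_vector_projection(class_features, keep_dim):
    """Build a projector onto the low-variance ("common") subspace of one
    acoustic class, estimated from feature vectors of many speakers.

    class_features : (n_samples, n_dims) array of feature vectors (e.g. MFCCs).
    keep_dim       : number of low-variance eigen-directions to keep.
    """
    mean = class_features.mean(axis=0)
    centered = class_features - mean

    # Within-class covariance matrix of the speaker-dependent variation.
    cov = centered.T @ centered / len(class_features)

    # np.linalg.eigh returns eigenvalues in ascending order, so the first
    # columns span the directions of least within-class variation.
    eigvals, eigvecs = np.linalg.eigh(cov)
    common_basis = eigvecs[:, :keep_dim]          # (n_dims, keep_dim)

    def project(x):
        # The "common vector" of an observation: its coordinates in the
        # speaker-invariant subspace.
        return common_basis.T @ (x - mean)

    return project

# Usage: build the projector from training features of one class, then use the
# projected (speaker-invariant) vectors as observations for that class's HMM states.
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 39))   # hypothetical 39-dim MFCC + deltas
project = common_vector_projection(feats, keep_dim=12)
invariant = project(feats[0])
```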
For the second problem, context-dependent modeling is performed to account for the co-articulatory effects of neighboring phones. This is important because the co-articulatory effect in continuous speech is significantly stronger than in isolated utterances. However, modeling these variations in sounds and pronunciations generates a large number of context-dependent models, and if the parameters of those models are all kept distinct, the total number of model parameters becomes very large. To solve this, the decision-tree state-tying technique is used to reduce the number of parameters and hence the computational complexity.
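(Editorial note: the decision-tree state-tying step can be illustrated with the minimal sketch below, assuming single-Gaussian occupancy statistics per context-dependent state and a user-supplied set of yes/no phonetic questions; the data layout and the `min_gain` threshold are assumptions, not the thesis's actual configuration.)

```python
import numpy as np

def log_likelihood(states):
    """Approximate log-likelihood of pooling the given states into one tied
    state, under the standard single-Gaussian (diagonal covariance) assumption.
    Each state is a dict with occupancy count "occ", mean "mean", variance "var"."""
    occ = sum(s["occ"] for s in states)
    if occ == 0:
        return 0.0
    mean = sum(s["occ"] * s["mean"] for s in states) / occ
    # Pooled variance from E[x^2] - mean^2 accumulated over member states.
    second = sum(s["occ"] * (s["var"] + s["mean"] ** 2) for s in states) / occ
    var = np.maximum(second - mean ** 2, 1e-6)
    return -0.5 * occ * np.sum(np.log(2 * np.pi * var) + 1.0)

def best_split(states, questions):
    """Pick the phonetic question whose yes/no split gives the largest
    log-likelihood gain over keeping the states tied together."""
    base = log_likelihood(states)
    best_q, best_gain = None, 0.0
    for q in questions:
        yes = [s for s in states if q(s["context"])]
        no = [s for s in states if not q(s["context"])]
        if not yes or not no:
            continue
        gain = log_likelihood(yes) + log_likelihood(no) - base
        if gain > best_gain:
            best_q, best_gain = q, gain
    return best_q, best_gain

def grow_tree(states, questions, min_gain):
    """Recursively split until no question improves likelihood by at least
    min_gain; each leaf becomes one tied state shared by all its members."""
    q, gain = best_split(states, questions)
    if q is None or gain < min_gain:
        return states            # leaf: these states share one output pdf
    yes = [s for s in states if q(s["context"])]
    no = [s for s in states if not q(s["context"])]
    return [grow_tree(yes, questions, min_gain),
            grow_tree(no, questions, min_gain)]
```

In practice one such tree is typically grown per base phone and per state position, and the minimum-gain (and minimum-occupancy) thresholds control how many tied states remain, trading model resolution against trainability.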
In our speaker-independent experiments on recognizing continuous speech sentences, the proposed scheme increases the average recognition rate of the conventional HMM-based one-state algorithm by over 26.039% without using any grammar or lexical information.