Implementation of Embedded Mandarin Speech Recognition System in Travel Domain

Master's === National Sun Yat-sen University === Department of Computer Science and Engineering === 97 === We build a two-pass Mandarin automatic speech recognition (ASR) decoder on a mobile device (PDA). The first pass, which recognizes base syllables, is implemented with a discrete hidden Markov model (HMM) and a time-synchronous, tree-lexicon Viterbi search. The second-pass dea...

Bibliographic Details
Main Authors:	Bo-han Chen, 陳柏含
Other Authors:	Chia-Ping Chen
Format:	Others
Language:	en_US
Published:	2009
Online Access:	http://ndltd.ncl.edu.tw/handle/mwrs63

id	ndltd-TW-097NSYS5392072
record_format	oai_dc
spelling	ndltd-TW-097NSYS5392072 2019-05-30T03:49:41Z http://ndltd.ncl.edu.tw/handle/mwrs63 Implementation of Embedded Mandarin Speech Recognition System in Travel Domain 基於旅遊對話運用嵌入式中文語音辨識系統之實作 Bo-han Chen 陳柏含 Master's, National Sun Yat-sen University, Department of Computer Science and Engineering, 97 Chia-Ping Chen 陳嘉平 2009 學位論文 ; thesis 65 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Master's === National Sun Yat-sen University === Department of Computer Science and Engineering === 97 === We build a two-pass Mandarin automatic speech recognition (ASR) decoder on a mobile device (PDA). The first pass, which recognizes base syllables, is implemented with a discrete hidden Markov model (HMM) and a time-synchronous, tree-lexicon Viterbi search. The second pass, which combines the language model, the pronunciation lexicon, and the N-best syllable hypotheses from the first pass, is implemented with weighted finite-state transducers (WFSTs). The best word sequence is obtained by running a shortest-path algorithm over the composition result. The system is restricted to the travel domain, and it decouples the application of the acoustic model from the application of the language model by placing them in independent recognition passes. We report real-time recognition performance on an ASUS P565 with an 800 MHz processor and 128 MB of RAM, running the Microsoft Windows Mobile 6 operating system. The 26-hour TCC-300 speech corpus is used to train 151 acoustic models. Three minutes of speech, recorded by reading travel-domain transcriptions, serve as the test set for evaluating syllable accuracy, character accuracy, and real-time factors on both PC and PDA. A bigram language model with a 3,500-word vocabulary, trained on the BTEC corpus, is used in the second pass. In the first pass, the best syllable accuracy is 38.8%, obtained with 30-best syllable hypotheses using a continuous HMM and 26-dimensional features. With these syllable hypotheses and this acoustic model, we obtain 27.6% character accuracy on PC after the second pass.
author2	Chia-Ping Chen
author_facet	Chia-Ping Chen Bo-han Chen 陳柏含
author	Bo-han Chen 陳柏含
spellingShingle	Bo-han Chen 陳柏含 Implementation of Embedded Mandarin Speech Recognition System in Travel Domain
author_sort	Bo-han Chen
title	Implementation of Embedded Mandarin Speech Recognition System in Travel Domain
title_short	Implementation of Embedded Mandarin Speech Recognition System in Travel Domain
title_full	Implementation of Embedded Mandarin Speech Recognition System in Travel Domain
title_fullStr	Implementation of Embedded Mandarin Speech Recognition System in Travel Domain
title_full_unstemmed	Implementation of Embedded Mandarin Speech Recognition System in Travel Domain
title_sort	implementation of embedded mandarin speech recognition system in travel domain
publishDate	2009
url	http://ndltd.ncl.edu.tw/handle/mwrs63
work_keys_str_mv	AT bohanchen implementationofembeddedmandarinspeechrecognitionsystemintraveldomain AT chénbǎihán implementationofembeddedmandarinspeechrecognitionsystemintraveldomain AT bohanchen jīyúlǚyóuduìhuàyùnyòngqiànrùshìzhōngwényǔyīnbiànshíxìtǒngzhīshízuò AT chénbǎihán jīyúlǚyóuduìhuàyùnyòngqiànrùshìzhōngwényǔyīnbiànshíxìtǒngzhīshízuò
_version_	1719193467664465920

Similar Items