Implementation of Embedded Mandarin Speech Recognition System in Travel Domain

Master's === National Sun Yat-sen University === Department of Computer Science and Engineering === 97 === We build a two-pass Mandarin automatic speech recognition (ASR) decoder on a mobile device (PDA). The first pass, which recognizes base syllables, is implemented with a discrete hidden Markov model (HMM) using time-synchronous, tree-lexicon Viterbi search. The second-pass dea...

Bibliographic Details
Main Author: Bo-han Chen (陳柏含)
Other Authors: Chia-Ping Chen
Format: Others
Language: en_US
Published: 2009
Online Access: http://ndltd.ncl.edu.tw/handle/mwrs63
id ndltd-TW-097NSYS5392072
record_format oai_dc
spelling ndltd-TW-097NSYS5392072; 2019-05-30T03:49:41Z; http://ndltd.ncl.edu.tw/handle/mwrs63; Implementation of Embedded Mandarin Speech Recognition System in Travel Domain; 基於旅遊對話運用嵌入式中文語音辨識系統之實作; Bo-han Chen (陳柏含); Master's; National Sun Yat-sen University; Department of Computer Science and Engineering; 97; Chia-Ping Chen (陳嘉平); 2009; thesis; 65; en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Sun Yat-sen University === Department of Computer Science and Engineering === 97 === We build a two-pass Mandarin automatic speech recognition (ASR) decoder on a mobile device (PDA). The first pass, which recognizes base syllables, is implemented with a discrete hidden Markov model (HMM) using time-synchronous, tree-lexicon Viterbi search. The second pass, which combines the language model, the pronunciation lexicon, and the N-best syllable hypotheses from the first pass, is implemented with weighted finite-state transducers (WFSTs). The best word sequence is obtained by running a shortest-path algorithm over the composition result. The system is restricted to the travel domain, and it decouples the application of the acoustic model from the application of the language model by placing them in independent recognition passes. We report real-time recognition performance on an ASUS P565 with an 800 MHz processor and 128 MB of RAM, running the Microsoft Windows Mobile 6 operating system. The 26-hour TCC-300 speech corpus is used to train 151 acoustic models. A 3-minute set of speech, recorded by reading travel-domain transcriptions, serves as the test set for evaluating accuracy (syllable and character) and real-time factors on both PC and PDA. The trained bigram model, with 3,500 words from the BTEC corpus, is used in the second pass. In the first pass, the best syllable accuracy is 38.8%, obtained with 30-best syllable hypotheses, a continuous HMM, and 26-dimensional features. With these syllable hypotheses and this acoustic model, we obtain 27.6% character accuracy on PC after the second pass.
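The second pass described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it uses no WFST library, and instead runs a plain dynamic program that gives the same result as a shortest path over the composition of one syllable hypothesis, a lexicon, and a bigram grammar. The lexicon entries, bigram costs, back-off cost, and all function names below are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Optional, Tuple

Syllables = Tuple[str, ...]

# Hypothetical toy lexicon: word label -> base-syllable pronunciation.
LEXICON: Dict[str, Syllables] = {
    "hotel": ("fan", "dian"),
    "rice": ("fan",),
    "shop": ("dian",),
    "day": ("tian",),
    "where": ("zai", "na"),
}

# Hypothetical bigram costs (negative log probabilities); "<s>" marks sentence start.
BIGRAM_COST: Dict[Tuple[str, str], float] = {
    ("<s>", "hotel"): 1.0,
    ("hotel", "where"): 0.5,
    ("<s>", "rice"): 2.0,
    ("rice", "shop"): 3.0,
    ("shop", "where"): 2.0,
}
BACKOFF_COST = 5.0  # assumed flat cost for any unseen bigram


def lm_cost(prev: str, word: str) -> float:
    """Bigram cost of `word` following `prev`, with a flat back-off."""
    return BIGRAM_COST.get((prev, word), BACKOFF_COST)


def best_segmentation(syllables: Syllables) -> Optional[Tuple[float, List[str]]]:
    """Cheapest word sequence whose pronunciations spell out `syllables`.

    States are (syllable position, previous word). Every arc moves forward
    in position, so visiting positions in order finds the shortest path.
    """
    best: Dict[Tuple[int, str], Tuple[float, List[str]]] = {(0, "<s>"): (0.0, [])}
    for pos in range(len(syllables)):
        states = [(k, v) for k, v in best.items() if k[0] == pos]
        for (_, prev), (cost, words) in states:
            for word, pron in LEXICON.items():
                end = pos + len(pron)
                if syllables[pos:end] != pron:
                    continue  # this word does not match the syllables here
                new_cost = cost + lm_cost(prev, word)
                key = (end, word)
                if key not in best or new_cost < best[key][0]:
                    best[key] = (new_cost, words + [word])
    finals = [v for k, v in best.items() if k[0] == len(syllables)]
    return min(finals, key=lambda v: v[0]) if finals else None


def second_pass(nbest: List[Tuple[Syllables, float]]) -> Tuple[float, List[str]]:
    """Pick the word sequence minimizing acoustic cost plus language-model cost."""
    best_total, best_words = math.inf, []
    for syllables, acoustic_cost in nbest:
        result = best_segmentation(tuple(syllables))
        if result is None:
            continue  # hypothesis cannot be covered by the lexicon
        lm, words = result
        if acoustic_cost + lm < best_total:
            best_total, best_words = acoustic_cost + lm, words
    return best_total, best_words


if __name__ == "__main__":
    hypotheses = [
        (("fan", "dian", "zai", "na"), 4.0),  # acoustically second best
        (("fan", "tian", "zai", "na"), 3.5),  # acoustically best
    ]
    print(second_pass(hypotheses))
```

In this toy run the acoustically weaker hypothesis wins once the bigram cost is added, which is the effect the second pass is meant to capture. A WFST toolkit would express the same search as transducer composition followed by a shortest-path operation.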
author2 Chia-Ping Chen
author_facet Chia-Ping Chen
Bo-han Chen
陳柏含
author Bo-han Chen
陳柏含
spellingShingle Bo-han Chen
陳柏含
Implementation of Embedded Mandarin Speech Recognition System in Travel Domain
author_sort Bo-han Chen
title Implementation of Embedded Mandarin Speech Recognition System in Travel Domain
title_sort implementation of embedded mandarin speechrecognition system in travel domain
publishDate 2009
url http://ndltd.ncl.edu.tw/handle/mwrs63