A Comparison of the Biphone-rich and the Triphone-rich Speech Corpora in Automatic Speech Recognition

Master's === National Tsing Hua University === Institute of Statistics === 92 === In this thesis we compare the performance of a speech recognition system trained on two speech corpora. From the dictionary of the Daiim input method, we select two sets of words such that they cover all the cross-syllable biphones and triphones, and are called...

Bibliographic Details
Main Author: 游永昌
Other Authors: 江永進
Format: Others
Language: zh-TW
Published: 2004
Online Access: http://ndltd.ncl.edu.tw/handle/91814154954977524536
id ndltd-TW-092NTHU5337013
record_format oai_dc
spelling ndltd-TW-092NTHU5337013 2015-10-13T13:08:03Z http://ndltd.ncl.edu.tw/handle/91814154954977524536 A Comparison of the Biphone-rich and the Triphone-rich Speech Corpora in Automatic Speech Recognition 三音豐富以及雙音豐富語音資料庫在語音辨識表現之探討 游永昌 Master's National Tsing Hua University Institute of Statistics 92 江永進 2004 thesis 44 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Tsing Hua University === Institute of Statistics === 92 === In this thesis we compare the performance of a speech recognition system trained on two speech corpora. From the dictionary of the Daiim input method, we select two sets of words, one covering all the cross-syllable biphones and the other covering all the cross-syllable triphones; they are called biphone-rich and triphone-rich, respectively. We find that complete coverage of the cross-syllable triphones requires about ten times as many words as complete coverage of the cross-syllable biphones. To allow a fair comparison, the biphone-rich corpus therefore consists of ten sets of words, each of which covers all the cross-syllable biphones. Notably, the triphone coverage of this biphone-rich corpus is much lower than that of the triphone-rich set. Using these words as the transcript, a male Taiwanese speaker recorded all the words as microphone speech. The resulting speech corpora, about 100 minutes for each set, are used to train the acoustic models. Although both perform quite well in tasks whose recognition networks are a linear net and a free syllable net, the triphone-rich corpus shows no advantage over the biphone-rich corpus.
author2 江永進
author_facet 江永進
游永昌
author 游永昌
spellingShingle 游永昌
A Comparison of the Biphone-rich and the Triphone-rich Speech Corpora in Automatic Speech Recognition
author_sort 游永昌
title A Comparison of the Biphone-rich and the Triphone-rich Speech Corpora in Automatic Speech Recognition
title_short A Comparison of the Biphone-rich and the Triphone-rich Speech Corpora in Automatic Speech Recognition
title_full A Comparison of the Biphone-rich and the Triphone-rich Speech Corpora in Automatic Speech Recognition
title_fullStr A Comparison of the Biphone-rich and the Triphone-rich Speech Corpora in Automatic Speech Recognition
title_full_unstemmed A Comparison of the Biphone-rich and the Triphone-rich Speech Corpora in Automatic Speech Recognition
title_sort comparison of the biphone-rich and the triphone-rich speech corpora in automatic speech recognition
publishDate 2004
url http://ndltd.ncl.edu.tw/handle/91814154954977524536
work_keys_str_mv AT yóuyǒngchāng acomparisonofthebiphonerichandthetriphonerichspeechcorporainautomaticspeechrecognition
AT yóuyǒngchāng sānyīnfēngfùyǐjíshuāngyīnfēngfùyǔyīnzīliàokùzàiyǔyīnbiànshíbiǎoxiànzhītàntǎo
AT yóuyǒngchāng comparisonofthebiphonerichandthetriphonerichspeechcorporainautomaticspeechrecognition
_version_ 1717732063453380608