A Comparison of the Biphone-rich and the Triphone-rich Speech Corpora in Automatic Speech Recognition
Master's thesis === National Tsing Hua University === Institute of Statistics === 92 === This thesis compares the performance of a speech recognition system trained on two speech corpora. From the dictionary of the Daiim input method, we select two sets of words, one covering all the cross-syllable biphones and the other covering all the cross-syllable triphones; they are called...
Main Author: | 游永昌 |
---|---|
Other Authors: | 江永進 |
Format: | Others |
Language: | zh-TW |
Published: | 2004 |
Online Access: | http://ndltd.ncl.edu.tw/handle/91814154954977524536 |
id |
ndltd-TW-092NTHU5337013 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-092NTHU5337013 2015-10-13T13:08:03Z http://ndltd.ncl.edu.tw/handle/91814154954977524536 A Comparison of the Biphone-rich and the Triphone-rich Speech Corpora in Automatic Speech Recognition 三音豐富以及雙音豐富語音資料庫在語音辨識表現之探討 游永昌 Master's National Tsing Hua University Institute of Statistics 92 This thesis compares the performance of a speech recognition system trained on two speech corpora. From the dictionary of the Daiim input method, we select two sets of words, one covering all the cross-syllable biphones and the other covering all the cross-syllable triphones; they are called biphone-rich and triphone-rich, respectively. We find that complete coverage of the cross-syllable triphones requires about ten times as many words as complete coverage of the cross-syllable biphones. To allow a fair comparison, the biphone-rich corpus therefore consists of ten sets of words, each of which covers all the cross-syllable biphones. Notably, the triphone coverage of this biphone-rich corpus is much lower than that of the triphone-rich set. Using these words as the transcript, a male Taiwanese speaker recorded all of them as microphone speech. The resulting speech corpora, about 100 minutes for each set, are used to train the acoustic models. Although both perform quite well in tasks with linear-net and free-syllable-net recognition networks, the triphone-rich corpus shows no advantage over the biphone-rich corpus. 江永進 2004 Degree thesis ; thesis 44 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's thesis === National Tsing Hua University === Institute of Statistics === 92 === This thesis compares the performance of a speech recognition system trained on two speech corpora. From the dictionary of the Daiim input method, we select two sets of words, one covering all the cross-syllable biphones and the other covering all the cross-syllable triphones; they are called biphone-rich and triphone-rich, respectively. We find that complete coverage of the cross-syllable triphones requires about ten times as many words as complete coverage of the cross-syllable biphones. To allow a fair comparison, the biphone-rich corpus therefore consists of ten sets of words, each of which covers all the cross-syllable biphones. Notably, the triphone coverage of this biphone-rich corpus is much lower than that of the triphone-rich set.
Using these words as the transcript, a male Taiwanese speaker recorded all of them as microphone speech. The resulting speech corpora, about 100 minutes for each set, are used to train the acoustic models. Although both perform quite well in tasks with linear-net and free-syllable-net recognition networks, the triphone-rich corpus shows no advantage over the biphone-rich corpus.
|
author2 |
江永進 |
author_facet |
江永進 游永昌 |
author |
游永昌 |
spellingShingle |
游永昌 A Comparison of the Biphone-rich and the Triphone-rich Speech Corpora in Automatic Speech Recognition |
author_sort |
游永昌 |
title |
A Comparison of the Biphone-rich and the Triphone-rich Speech Corpora in Automatic Speech Recognition |
title_short |
A Comparison of the Biphone-rich and the Triphone-rich Speech Corpora in Automatic Speech Recognition |
title_full |
A Comparison of the Biphone-rich and the Triphone-rich Speech Corpora in Automatic Speech Recognition |
title_fullStr |
A Comparison of the Biphone-rich and the Triphone-rich Speech Corpora in Automatic Speech Recognition |
title_full_unstemmed |
A Comparison of the Biphone-rich and the Triphone-rich Speech Corpora in Automatic Speech Recognition |
title_sort |
comparison of the biphone-rich and the triphone-rich speech corpora in automatic speech recognition |
publishDate |
2004 |
url |
http://ndltd.ncl.edu.tw/handle/91814154954977524536 |
work_keys_str_mv |
AT yóuyǒngchāng acomparisonofthebiphonerichandthetriphonerichspeechcorporainautomaticspeechrecognition AT yóuyǒngchāng sānyīnfēngfùyǐjíshuāngyīnfēngfùyǔyīnzīliàokùzàiyǔyīnbiànshíbiǎoxiànzhītàntǎo AT yóuyǒngchāng comparisonofthebiphonerichandthetriphonerichspeechcorporainautomaticspeechrecognition |
_version_ |
1717732063453380608 |
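
Illustrative note: the abstract describes selecting words from a dictionary so that every cross-syllable biphone is covered. The record does not say how the selection was done, so the following is only a minimal sketch of one common approach, a greedy set cover. It assumes, hypothetically, that a cross-syllable biphone is the pair formed by the last phone of one syllable and the first phone of the next; the names `cross_syllable_biphones`, `greedy_cover`, and the toy lexicon are invented for illustration and do not come from the thesis.

```python
from typing import Dict, List, Set, Tuple

Syllable = Tuple[str, ...]   # phones of one syllable, e.g. ("t", "a")
Word = Tuple[Syllable, ...]  # a word is a sequence of syllables


def cross_syllable_biphones(word: Word) -> Set[Tuple[str, str]]:
    """Assumed definition: (last phone of a syllable, first phone of the next)."""
    return {(a[-1], b[0]) for a, b in zip(word, word[1:])}


def greedy_cover(lexicon: Dict[str, Word]) -> List[str]:
    """Greedily pick words until every biphone present in the lexicon is covered."""
    units = {name: cross_syllable_biphones(w) for name, w in lexicon.items()}
    uncovered: Set[Tuple[str, str]] = set().union(*units.values())
    chosen: List[str] = []
    while uncovered:
        # Take the word covering the most still-uncovered biphones;
        # iterating over sorted names breaks ties alphabetically.
        best = max(sorted(units), key=lambda n: len(units[n] & uncovered))
        chosen.append(best)
        uncovered -= units[best]
    return chosen


# Hypothetical toy lexicon, not data from the thesis.
toy = {
    "w1": (("t", "a"), ("k", "i")),
    "w2": (("t", "a"), ("k", "i"), ("s", "u")),
    "w3": (("m", "o"), ("t", "a")),
}
selected = greedy_cover(toy)
print(selected)
```

Greedy selection does not guarantee the smallest possible word list, but it is simple and usually gives a compact cover. The same routine would apply to triphones if the unit-extraction function were swapped for one that returns three-phone units.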