An Initial Study on Hakka Speech Synthesis
Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 90 === This thesis studies speech synthesis for Hai-Lu, a sub-dialect of Hakka. Because Hai-Lu Hakka has 7 lexical tones and 736 base syllables, it is more complex than other Hakka sub-dialects. Signal waveform synthesis and prosody parameter generation are the...
Main Authors: | Shie-Jen Lee 李雪貞 |
---|---|
Other Authors: | Hung-yan Gu 古鴻炎 |
Format: | Others |
Language: | zh-TW |
Published: | 2002 |
Online Access: | http://ndltd.ncl.edu.tw/handle/45818970175515086055 |
id |
ndltd-TW-090NTUST392001 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-090NTUST392001 2015-10-13T14:41:23Z http://ndltd.ncl.edu.tw/handle/45818970175515086055 An Initial Study on Hakka Speech Synthesis 客語語音合成之初步研究 Shie-Jen Lee 李雪貞 Master's National Taiwan University of Science and Technology Department of Computer Science and Information Engineering 90 Hung-yan Gu 古鴻炎 2002 thesis 55 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 90 === This thesis studies speech synthesis for Hai-Lu, a sub-dialect of Hakka. Because Hai-Lu Hakka has 7 lexical tones and 736 base syllables, it is more complex than other Hakka sub-dialects. The thesis focuses on signal waveform synthesis and prosody parameter generation. For signal waveform synthesis, the Time-Proportioned Interpolation of Pitch Waveform (TIPW) method is adopted and modified to handle Hakka syllables pronounced with an abrupt tone, which does not occur in Mandarin. For prosody parameter generation, we hypothesize that the sentence pitch-contour hidden Markov model (HMM) trained for Mandarin speech synthesis can be borrowed directly to generate pitch-contour parameters for a Hakka sentence, provided an appropriate tone mapping rule is used. We therefore studied this problem and proposed a Hakka-to-Mandarin tone mapping rule. The perception experiments we conducted indicate that this hypothesis is workable. For syllable duration and amplitude, generation rules proposed in previous studies are adopted and modified to improve prosodic expression.
Using these strategies, we built a prototype system for Hakka speech synthesis, which also implements some simple text-analysis functions. With this system, two sets of sentences were synthesized for evaluation experiments. The set of Hakka sentences was synthesized earlier than the set of Min-Nan sentences, before the generation rules for the prosodic parameters had been settled. Because of this timing difference, and because a notebook computer's poor speaker was used for playback, the comprehension and naturalness scores of the Hakka synthetic speech reached only 91.87% and 79.5, respectively. For the Min-Nan synthetic speech, the comprehension and naturalness scores were 97.1% and 85.5, respectively.
|
author2 |
Hung-yan Gu |
author_facet |
Hung-yan Gu Shie-Jen Lee 李雪貞 |
author |
Shie-Jen Lee 李雪貞 |
spellingShingle |
Shie-Jen Lee 李雪貞 An Initial Study on Hakka Speech Synthesis |
author_sort |
Shie-Jen Lee |
title |
An Initial Study on Hakka Speech Synthesis |
title_short |
An Initial Study on Hakka Speech Synthesis |
title_full |
An Initial Study on Hakka Speech Synthesis |
title_fullStr |
An Initial Study on Hakka Speech Synthesis |
title_full_unstemmed |
An Initial Study on Hakka Speech Synthesis |
title_sort |
initial study on hakka speech synthesis |
publishDate |
2002 |
url |
http://ndltd.ncl.edu.tw/handle/45818970175515086055 |
work_keys_str_mv |
AT shiejenlee aninitialstudyonhakkaspeechsynthesis AT lǐxuězhēn aninitialstudyonhakkaspeechsynthesis AT shiejenlee kèyǔyǔyīnhéchéngzhīchūbùyánjiū AT lǐxuězhēn kèyǔyǔyīnhéchéngzhīchūbùyánjiū AT shiejenlee initialstudyonhakkaspeechsynthesis AT lǐxuězhēn initialstudyonhakkaspeechsynthesis |
_version_ |
1717756274898108416 |
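
The abstract describes reusing a Mandarin pitch-contour model for Hakka by first mapping each Hakka tone to a Mandarin tone. The following is only a minimal sketch of that mapping step, not the thesis's implementation. The record does not give the actual mapping rule, so `PLACEHOLDER_RULE` is a hypothetical stand-in. The 1..7 tone numbering, the 1..5 Mandarin tone numbering (four lexical tones plus the neutral tone), and the function names are likewise assumptions made for illustration.

```python
from typing import Dict, List

# 7 lexical tones for Hai-Lu Hakka (from the abstract); the 1..7 numbering is an assumption.
HAKKA_TONES = range(1, 8)
# Assumption: the Mandarin model uses 4 lexical tones plus the neutral tone, numbered 1..5.
MANDARIN_TONES = range(1, 6)

# Hypothetical placeholder values, NOT the mapping rule proposed in the thesis.
PLACEHOLDER_RULE: Dict[int, int] = {1: 1, 2: 2, 3: 3, 4: 4, 5: 1, 6: 2, 7: 4}


def validate_rule(rule: Dict[int, int]) -> None:
    """Check that the rule covers every Hakka tone and maps only to Mandarin tones."""
    missing = [t for t in HAKKA_TONES if t not in rule]
    if missing:
        raise ValueError(f"rule has no entry for Hakka tones {missing}")
    bad = {h: m for h, m in rule.items() if m not in MANDARIN_TONES}
    if bad:
        raise ValueError(f"rule maps to unknown Mandarin tones: {bad}")


def map_sentence_tones(hakka_tones: List[int], rule: Dict[int, int]) -> List[int]:
    """Convert a sentence's Hakka tone sequence into a Mandarin tone sequence.

    The result is what would be handed to a Mandarin-trained pitch-contour
    model (not implemented here) in place of the original Hakka tones.
    """
    validate_rule(rule)
    unknown = [t for t in hakka_tones if t not in rule]
    if unknown:
        raise ValueError(f"sentence contains unknown Hakka tones {unknown}")
    return [rule[t] for t in hakka_tones]
```

For example, `map_sentence_tones([1, 5, 7], PLACEHOLDER_RULE)` looks up each syllable's tone in the table and returns the corresponding Mandarin tone sequence. Passing the rule as a parameter keeps the sketch independent of any particular mapping.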