An Initial Study on Hakka Speech Synthesis

Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 90 === This thesis studies speech synthesis for Hai-Lu, a sub-dialect of Hakka. Because Hai-Lu Hakka has 7 lexical tones and 736 base syllables, it is more complex than other Hakka sub-dialects. Signal waveform synthesis and prosody parameter generation are the...

Bibliographic Details
Main Authors:	Shie-Jen Lee, 李雪貞
Other Authors:	Hung-yan Gu
Format:	Others
Language:	zh-TW
Published:	2002
Online Access:	http://ndltd.ncl.edu.tw/handle/45818970175515086055

id	ndltd-TW-090NTUST392001
record_format	oai_dc
spelling	ndltd-TW-090NTUST392001 2015-10-13T14:41:23Z http://ndltd.ncl.edu.tw/handle/45818970175515086055 An Initial Study on Hakka Speech Synthesis 客語語音合成之初步研究 Shie-Jen Lee 李雪貞 Hung-yan Gu 古鴻炎 2002 學位論文 ; thesis 55 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 90 === This thesis studies speech synthesis for Hai-Lu, a sub-dialect of Hakka. Because Hai-Lu Hakka has 7 lexical tones and 736 base syllables, it is more complex than other Hakka sub-dialects. The thesis focuses on signal waveform synthesis and prosody parameter generation. For signal waveform synthesis, the Time-Proportioned Interpolation of Pitch Waveform (TIPW) method is adopted and modified to handle Hakka syllables pronounced with an abrupt tone, which does not occur in Mandarin. For prosody parameter generation, we hypothesize that the sentence pitch-contour hidden Markov model trained for Mandarin speech synthesis can be borrowed directly to generate pitch-contour parameters for a Hakka sentence, provided an appropriate tone mapping rule is used. We therefore studied this problem and proposed a Hakka-to-Mandarin tone mapping rule. The perception experiments we conducted indicate that this hypothesis is workable. For syllable duration and amplitude, generation rules proposed in previous studies are adopted and modified to give better prosodic expression. Using these strategies, we built a prototype Hakka speech synthesis system, which also implements some simple text-analysis functions. With this system, two sets of sentences were synthesized for evaluation experiments. The set of Hakka sentences was synthesized earlier than the set of Min-Nan sentences, before the generation rules for prosodic parameters had been settled. Owing to this timing difference and to playback through a notebook computer's poor speaker, the comprehension and naturalness scores of the Hakka synthetic speech reach only 91.87% and 79.5, respectively. For the Min-Nan synthetic speech, the scores are 97.1% and 85.5 for comprehension and naturalness, respectively.
author2	Hung-yan Gu
author_facet	Hung-yan Gu Shie-Jen Lee 李雪貞
author	Shie-Jen Lee 李雪貞
spellingShingle	Shie-Jen Lee 李雪貞 An Initial Study on Hakka Speech Synthesis
author_sort	Shie-Jen Lee
title	An Initial Study on Hakka Speech Synthesis
title_sort	initial study on hakka speech synthesis
publishDate	2002
url	http://ndltd.ncl.edu.tw/handle/45818970175515086055

Similar Items