Prosody Hierarchy Construction for Mixed Chinese-English Spelling Speech and its Application to TTS

Master's thesis === National Chiao Tung University === Institute of Communications Engineering === 99 === In this thesis, an unsupervised joint prosody labeling and modeling (PLM) method for mixed Chinese-English word spelling speech is proposed. It labels an unlabeled corpus with two types of prosodic tags (i.e., break type of inter-syllable juncture and prosodic st...

Bibliographic Details
Main Authors: Tsai, Cheng-Yeh, 蔡承燁
Other Authors: Chen, Sin-Horng
Format: Others
Language: zh-TW
Published: 2010
Online Access: http://ndltd.ncl.edu.tw/handle/14054085961984270543
id ndltd-TW-099NCTU5435041
record_format oai_dc
spelling ndltd-TW-099NCTU5435041 2016-04-18T04:21:47Z http://ndltd.ncl.edu.tw/handle/14054085961984270543 Prosody Hierarchy Construction for Mixed Chinese-English Spelling Speech and its Application to TTS 中英夾雜語音之階層式韻律架構建立與語音合成之應用 Tsai, Cheng-Yeh 蔡承燁 Master's thesis, National Chiao Tung University, Institute of Communications Engineering, 99 Chen, Sin-Horng 陳信宏 2010 thesis 91 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's thesis === National Chiao Tung University === Institute of Communications Engineering === 99 === This thesis proposes an unsupervised joint prosody labeling and modeling (PLM) method for mixed Chinese-English word-spelling speech. The method labels an unlabeled corpus with two types of prosodic tags (the break type of each inter-syllable juncture and the prosodic state of each syllable) and builds four prosodic models simultaneously. The break tags delimit the prosodic constituents of a hierarchical prosody structure, and the prosodic states are used to construct the prosodic feature patterns of those constituents. The four prosodic models describe the relationships among the acoustic prosodic features, the prosodic tags of the utterances, and the linguistic features of the associated texts. The experimental results showed that prosodic variation in English word spelling was influenced both by the prosodic state, which describes the underlying intonation, and by a Chinese tone-borrowing effect. The relationship between hierarchical noun phrase structure and the corresponding break type was also analyzed; the analysis suggested that the magnitude of the break type was highly correlated with the syntactic hierarchy within a noun phrase. Lastly, two PLM-based prosody generation methods are proposed for a mixed Chinese-English word-spelling text-to-speech (TTS) system. In the first method, a break predictor is constructed with the CART method, and the related linguistic features together with the predicted break tags are then used to train an HMM-based Text-to-Speech system (HTS). In the second method, the PLM is used directly as a prosody generator. Experimental results confirmed that the first method was superior, in both objective and subjective tests, to the conventional HTS that uses only linguistic features. In addition, the second method was significantly better than the conventional HTS method at syllable duration prediction. We therefore conclude that the proposed PLM method was successful in prosody labeling and modeling for constructing a mixed Chinese-English word-spelling TTS.
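As a rough illustration of how break tags can delimit prosodic constituents, the sketch below groups a syllable sequence into units at two levels of a hierarchy. The tag names (`B0` to `B4`), their strength ordering, the function `split_at_breaks`, and the toy utterance are all assumptions made for this example; the abstract does not list the thesis's actual break inventory, and this is not the PLM algorithm itself.

```python
from typing import List, Sequence

# Hypothetical break inventory, ordered from weakest to strongest juncture.
# The abstract does not list the thesis's actual break types.
BREAK_STRENGTH = {"B0": 0, "B1": 1, "B2": 2, "B3": 3, "B4": 4}


def split_at_breaks(
    syllables: Sequence[str], breaks: Sequence[str], min_break: str
) -> List[List[str]]:
    """Group syllables into constituents that end at any break >= min_break.

    breaks[i] is the tag of the juncture that follows syllables[i].
    """
    if len(syllables) != len(breaks):
        raise ValueError("need exactly one break tag per syllable")
    threshold = BREAK_STRENGTH[min_break]
    constituents: List[List[str]] = []
    current: List[str] = []
    for syllable, tag in zip(syllables, breaks):
        current.append(syllable)
        if BREAK_STRENGTH[tag] >= threshold:
            # A strong enough juncture closes the current constituent.
            constituents.append(current)
            current = []
    if current:  # trailing syllables with no closing break
        constituents.append(current)
    return constituents


# Toy mixed utterance (invented): Chinese syllables around spelled letters.
syllables = ["wo", "yong", "U", "S", "B", "chuan", "shu"]
breaks = ["B1", "B2", "B0", "B0", "B2", "B1", "B4"]

lower_level_units = split_at_breaks(syllables, breaks, "B1")
higher_level_units = split_at_breaks(syllables, breaks, "B2")
```

Raising `min_break` merges the smaller units into larger ones, which is the sense in which a single sequence of break tags can encode a multi-level structure.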
author2 Chen, Sin-Horng
author_facet Chen, Sin-Horng
Tsai, Cheng-Yeh
蔡承燁
author Tsai, Cheng-Yeh
蔡承燁
author_sort Tsai, Cheng-Yeh
title Prosody Hierarchy Construction for Mixed Chinese-English Spelling Speech and its Application to TTS
publishDate 2010
url http://ndltd.ncl.edu.tw/handle/14054085961984270543