An Implementation of HMM-based English Speech Synthesis

Master's === National Chiao Tung University === Institute of Communications Engineering === 100 === This thesis builds an online English text-to-speech system. The speech database consists of TOEFL articles read by a female speaker whose native language is Chinese. The database is first segmented with a well-trained triphone model; the CMU dictionary and the Stanford POS Tagger so...


Bibliographic Details
Main Authors: Liu, Kuan-Yi, 劉冠驛
Other Authors: Chen, Sin-Horng
Format: Others
Language: zh-TW
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/96185639096441948381
id ndltd-TW-100NCTU5435004
record_format oai_dc
spelling ndltd-TW-100NCTU5435004 2015-10-13T20:37:27Z http://ndltd.ncl.edu.tw/handle/96185639096441948381 An Implementation of HMM-based English Speech Synthesis 基於隱藏式馬可夫模型之英文語音合成系統實作 Liu, Kuan-Yi 劉冠驛 Master's National Chiao Tung University Institute of Communications Engineering 100 This thesis builds an online English text-to-speech system. The speech database consists of TOEFL articles read by a female speaker whose native language is Chinese. The database is first segmented with a well-trained triphone model. The CMU dictionary and the Stanford POS Tagger are then used to label relative-position and prosodic information at five structural levels (phone, syllable, word, phrase, and sentence). From these labels, vocal tract, fundamental frequency, and duration models are built, with the aim of producing speech with richer prosody and rhythm. The experimental results show that the synthesized prosody is still not natural enough. Compared with speech synthesized by foreign websites, our prosody has more variation but sounds more blurred, with odd rises and falls. Presumably, the rule-based method used to estimate the various prosodic labels is still not accurate enough. As a result, the prosody of the synthesized speech is correct in general but shows strange fluctuations in detail. Chen, Sin-Horng 陳信宏 2011 學位論文 ; thesis 52 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Chiao Tung University === Institute of Communications Engineering === 100 === This thesis builds an online English text-to-speech system. The speech database consists of TOEFL articles read by a female speaker whose native language is Chinese. The database is first segmented with a well-trained triphone model. The CMU dictionary and the Stanford POS Tagger are then used to label relative-position and prosodic information at five structural levels (phone, syllable, word, phrase, and sentence). From these labels, vocal tract, fundamental frequency, and duration models are built, with the aim of producing speech with richer prosody and rhythm. The experimental results show that the synthesized prosody is still not natural enough. Compared with speech synthesized by foreign websites, our prosody has more variation but sounds more blurred, with odd rises and falls. Presumably, the rule-based method used to estimate the various prosodic labels is still not accurate enough. As a result, the prosody of the synthesized speech is correct in general but shows strange fluctuations in detail.
author2 Chen, Sin-Horng
author_facet Chen, Sin-Horng
Liu, Kuan-Yi
劉冠驛
author Liu, Kuan-Yi
劉冠驛
spellingShingle Liu, Kuan-Yi
劉冠驛
An Implementation of HMM-based English Speech Synthesis
author_sort Liu, Kuan-Yi
title An Implementation of HMM-based English Speech Synthesis
publishDate 2011
url http://ndltd.ncl.edu.tw/handle/96185639096441948381