Summary: | Master's === Tamkang University === Graduate Institute of Information Engineering === 82 === In this paper, we synthesize Chinese words by means of
waveform splicing based on phonemes and pitch periods. To
synthesize speech naturally and fluently by concatenating
phonemes or pitch periods, two important issues must be
addressed: 1. How to capture the fundamental waveform. 2. How to
handle the situations that arise when splicing. We propose some
processes for these problems. The improvement in the quality of
the synthesized speech was confirmed by experiments: (1) How to
capture the fundamental
waveform: The quality of synthesized speech is highly dependent
on the method used to select speech waveforms while building
the speech database. There are two types of speech signal we
want to capture: consonants and vowels. 1. When collecting
consonants: we retain the pure consonant part and some periods
of the adjoining vowel. In this way, CV-type concatenation can
be made easily and fluently. 2. When collecting vowels:
we use the FFT cepstrum to estimate the pitch of the speech
segment. We find the maximum amplitude (MA) in the pitch period,
take the nearest zero-crossing point before the MA as the
starting point of the pitch period, and take the point just
before the next period's starting point as the ending point.
Repeating the capture process, we retain
nine pitch periods in the range of 3. The high sampling rate is
applied to build the speech database. The higher resolution of
the speech waveform reduces the distortion of the interpolation
process. Thus speech synthesized from a database with a higher
sampling rate is closer to the original speech than speech
synthesized from a lower-rate one. (2) How
to handle the situations when splicing: The concatenation
problem takes place in these situations: 1. Concatenation of a
consonant and a vowel (CV type): 2. Concatenation of pitch
periods in a vowel (V type):
|
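The vowel-capturing steps described in the summary (estimating pitch with the FFT cepstrum, then taking the nearest zero crossing before the maximum amplitude as the start of a pitch period) can be sketched in Python with NumPy. This is only a minimal illustration under stated assumptions, not the thesis's implementation: the function names, the Hamming window, the 60 to 400 Hz search range, and the reading of "maximum amplitude" as the largest sample value are all hypothetical choices made here.

```python
import numpy as np


def cepstral_pitch_period(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the pitch period (in samples) of a voiced frame.

    Assumption: the pitch lies between f_min and f_max Hz, so only that
    quefrency range of the real cepstrum is searched for a peak.
    """
    n = len(frame)
    windowed = frame * np.hamming(n)           # reduce spectral leakage
    spectrum = np.fft.rfft(windowed)
    log_mag = np.log(np.abs(spectrum) + 1e-12)  # small floor avoids log(0)
    cepstrum = np.fft.irfft(log_mag, n)         # real cepstrum
    q_lo = int(sample_rate / f_max)             # shortest period considered
    q_hi = int(sample_rate / f_min)             # longest period considered
    return q_lo + int(np.argmax(cepstrum[q_lo:q_hi]))


def period_start(signal, search_start, period):
    """Locate the start of a pitch period near search_start.

    Finds the largest sample (the "MA") within one period, then walks
    backwards to the nearest sign change before it. The returned index is
    the first sample after that zero crossing.
    """
    segment = signal[search_start:search_start + period]
    i = search_start + int(np.argmax(segment))  # index of the MA
    # Step back while neighbouring samples share the same sign.
    while i > 0 and signal[i - 1] * signal[i] > 0:
        i -= 1
    return i
```

Under this reading, the end of a period would be the sample just before the next value returned by `period_start`, and repeating the call with `search_start` advanced by one estimated period walks through successive pitch periods.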