Summary: | Master's === National Sun Yat-sen University === Graduate Institute of Computer Science and Engineering === 101 === In this thesis, we propose and implement a concatenative synthesis system that synthesizes a singing voice with background music. For every syllable in the phonetic symbol table, we record three different pitches to build our corpus. The synthesis information, including velocity, note number, start time, and end time, is extracted from the main melody of a MIDI file. Information on runs and riffs is then also taken into account. We use TD-PSOLA to modify the synthesis units in the time domain. Finally, we mix the background music extracted from the MIDI file back into the synthesized song. We implemented a user interface for synthesizing songs. The interface can also be used to adjust a synthesized song, for example by shifting the overall pitch of the song or modifying individual syllables. We conducted experiments to evaluate the quality, clarity, and similarity of the synthesized songs. The results show that the proposed method performs better on simple songs than on fast songs. In our experiments, the synthesized songs are divided into seven categories: nursery rhymes, folk, lyrical, fast-paced, solemn and stirring, Chinese style, and rhythm and blues. The proposed method can feasibly be applied to other languages and can also be used for humming singing synthesis.
|
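The abstract says that velocity, note number, start time, and end time are taken from the MIDI melody and that each syllable is recorded at three pitches. As a rough illustration only, the sketch below shows how such note data might be turned into the pitch and duration factors that a TD-PSOLA stage would consume. The names `Note`, `midi_to_hz`, and `plan_unit`, the example corpus pitches, and the linear velocity-to-gain mapping are all assumptions made for this example and do not come from the thesis. The TD-PSOLA processing itself is not shown.

```python
from dataclasses import dataclass


@dataclass
class Note:
    """One melody note, holding the four fields named in the abstract."""
    note_number: int  # MIDI note number (0-127)
    velocity: int     # MIDI velocity (0-127)
    start: float      # start time in seconds
    end: float        # end time in seconds


def midi_to_hz(note_number):
    """Standard equal-temperament conversion, with A4 (note 69) = 440 Hz."""
    return 440.0 * 2.0 ** ((note_number - 69) / 12.0)


def plan_unit(note, corpus_pitches, unit_duration):
    """Choose the closest recorded pitch and compute modification factors.

    corpus_pitches: MIDI note numbers at which the syllable was recorded
                    (hypothetical values; the thesis records three pitches).
    unit_duration:  length of the recorded unit in seconds (assumed known).
    """
    # Pick the recording with the smallest semitone distance to the target.
    source = min(corpus_pitches, key=lambda p: abs(p - note.note_number))
    # Factor by which the fundamental frequency has to be scaled.
    pitch_ratio = midi_to_hz(note.note_number) / midi_to_hz(source)
    # Factor by which the unit has to be stretched or compressed in time.
    time_ratio = (note.end - note.start) / unit_duration
    # Assumed linear mapping from velocity to output gain.
    gain = note.velocity / 127.0
    return source, pitch_ratio, time_ratio, gain


if __name__ == "__main__":
    note = Note(note_number=69, velocity=100, start=0.0, end=0.5)
    print(plan_unit(note, corpus_pitches=(55, 60, 65), unit_duration=0.25))
```

Choosing the nearest recorded pitch keeps the pitch ratio close to 1, which generally limits the audible artifacts of time-domain pitch modification.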