Phone-based speech synthesis using neural network with articulatory control.

Bibliographic Details
Other Authors: Lo, Wai Kit.
Format: Others
Language: English
Published: Chinese University of Hong Kong 1996
Online Access: http://library.cuhk.edu.hk/record=b5895723
http://repository.lib.cuhk.edu.hk/en/item/cuhk-321513
Description
Summary: by Lo Wai Kit.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1996.
Includes bibliographical references (leaves 151-160).
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Applications of Speech Synthesis --- p.2
Chapter 1.1.1 --- Human Machine Interface --- p.2
Chapter 1.1.2 --- Speech Aids --- p.3
Chapter 1.1.3 --- Text-To-Speech (TTS) system --- p.4
Chapter 1.1.4 --- Speech Dialogue System --- p.4
Chapter 1.2 --- Current Status in Speech Synthesis --- p.6
Chapter 1.2.1 --- Concatenation Based --- p.6
Chapter 1.2.2 --- Parametric Based --- p.7
Chapter 1.2.3 --- Articulatory Based --- p.7
Chapter 1.2.4 --- Application of Neural Network in Speech Synthesis --- p.8
Chapter 1.3 --- The Proposed Neural Network Speech Synthesis --- p.9
Chapter 1.3.1 --- Motivation --- p.9
Chapter 1.3.2 --- Objectives --- p.9
Chapter 1.4 --- Thesis outline --- p.11
Chapter 2 --- Linguistic Basics for Speech Synthesis --- p.12
Chapter 2.1 --- Relations between Linguistic and Speech Synthesis --- p.12
Chapter 2.2 --- Basic Phonology and Phonetics --- p.14
Chapter 2.2.1 --- Phonology --- p.14
Chapter 2.2.2 --- Phonetics --- p.15
Chapter 2.2.3 --- Prosody --- p.16
Chapter 2.3 --- Transcription Systems --- p.17
Chapter 2.3.1 --- The Employed Transcription System --- p.18
Chapter 2.4 --- Cantonese Phonology --- p.20
Chapter 2.4.1 --- Some Properties of Cantonese --- p.20
Chapter 2.4.2 --- Initial --- p.21
Chapter 2.4.3 --- Final --- p.23
Chapter 2.4.4 --- Lexical Tone --- p.25
Chapter 2.4.5 --- Variations --- p.26
Chapter 2.5 --- The Vowel Quadrilaterals --- p.29
Chapter 3 --- Speech Synthesis Technology --- p.32
Chapter 3.1 --- The Human Speech Production --- p.32
Chapter 3.2 --- Important Issues in Speech Synthesis System --- p.34
Chapter 3.2.1 --- Controllability --- p.34
Chapter 3.2.2 --- Naturalness --- p.34
Chapter 3.2.3 --- Complexity --- p.35
Chapter 3.2.4 --- Information Storage --- p.35
Chapter 3.3 --- Units for Synthesis --- p.37
Chapter 3.4 --- Type of Synthesizer --- p.40
Chapter 3.4.1 --- Copy Concatenation --- p.40
Chapter 3.4.2 --- Vocoder --- p.41
Chapter 3.4.3 --- Articulatory Synthesis --- p.44
Chapter 4 --- Neural Network Speech Synthesis with Articulatory Control --- p.47
Chapter 4.1 --- Neural Network Approximation --- p.48
Chapter 4.1.1 --- The Approximation Problem --- p.48
Chapter 4.1.2 --- Network Approach for Approximation --- p.49
Chapter 4.2 --- Artificial Neural Network for Phone-based Speech Synthesis --- p.53
Chapter 4.2.1 --- Network Approximation for Speech Signal Synthesis --- p.53
Chapter 4.2.2 --- Feed forward Backpropagation Neural Network --- p.56
Chapter 4.2.3 --- Radial Basis Function Network --- p.58
Chapter 4.2.4 --- Parallel Operating Synthesizer Networks --- p.59
Chapter 4.3 --- Template Storage and Control for the Synthesizer Network --- p.61
Chapter 4.3.1 --- Implicit Template Storage --- p.61
Chapter 4.3.2 --- Articulatory Control Parameters --- p.61
Chapter 4.4 --- Summary --- p.65
Chapter 5 --- Prototype Implementation of the Synthesizer Network --- p.66
Chapter 5.1 --- Implementation of the Synthesizer Network --- p.66
Chapter 5.1.1 --- Network Architectures --- p.68
Chapter 5.1.2 --- Spectral Templates for Training --- p.74
Chapter 5.1.3 --- System requirement --- p.76
Chapter 5.2 --- Subjective Listening Test --- p.79
Chapter 5.2.1 --- Sample Selection --- p.79
Chapter 5.2.2 --- Test Procedure --- p.81
Chapter 5.2.3 --- Result --- p.83
Chapter 5.2.4 --- Analysis --- p.86
Chapter 5.3 --- Summary --- p.88
Chapter 6 --- Simplified Articulatory Control for the Synthesizer Network --- p.89
Chapter 6.1 --- Coarticulatory Effect in Speech Production --- p.90
Chapter 6.1.1 --- Acoustic Effect --- p.90
Chapter 6.1.2 --- Prosodic Effect --- p.91
Chapter 6.2 --- Control in various Synthesis Techniques --- p.92
Chapter 6.2.1 --- Copy Concatenation --- p.92
Chapter 6.2.2 --- Formant Synthesis --- p.93
Chapter 6.2.3 --- Articulatory synthesis --- p.93
Chapter 6.3 --- Articulatory Control Model based on Vowel Quad --- p.94
Chapter 6.3.1 --- Modeling of Variations with the Articulatory Control Model --- p.95
Chapter 6.4 --- Voice Correspondence --- p.97
Chapter 6.4.1 --- For Nasal Sounds - Inter-Network Correspondence --- p.98
Chapter 6.4.2 --- In Flat-Tongue Space - Intra-Network Correspondence --- p.101
Chapter 6.5 --- Summary --- p.108
Chapter 7 --- Pause Duration Properties in Cantonese Phrases --- p.109
Chapter 7.1 --- The Prosodic Feature - Inter-Syllable Pause --- p.110
Chapter 7.2 --- Experiment for Measuring Inter-Syllable Pause of Cantonese Phrases --- p.111
Chapter 7.2.1 --- Speech Material Selection --- p.111
Chapter 7.2.2 --- Experimental Procedure --- p.112
Chapter 7.2.3 --- Result --- p.114
Chapter 7.3 --- Characteristics of Inter-Syllable Pause in Cantonese Phrases --- p.117
Chapter 7.3.1 --- Pause Duration Characteristics for Initials after Pause --- p.117
Chapter 7.3.2 --- Pause Duration Characteristic for Finals before Pause --- p.119
Chapter 7.3.3 --- General Observations --- p.119
Chapter 7.3.4 --- Other Observations --- p.121
Chapter 7.4 --- Application of Pause-duration Statistics to the Synthesis System --- p.124
Chapter 7.5 --- Summary --- p.126
Chapter 8 --- Conclusion and Further Work --- p.127
Chapter 8.1 --- Conclusion --- p.127
Chapter 8.2 --- Further Extension Work --- p.130
Chapter 8.2.1 --- Regularization Network Optimized on ISD --- p.130
Chapter 8.2.2 --- Incorporation of Non-Articulatory Parameters to Control Space --- p.130
Chapter 8.2.3 --- Experiment on Other Prosodic Features --- p.131
Chapter 8.2.4 --- Application of Voice Correspondence to Cantonese Coda Discrimination --- p.131
Chapter A --- Cantonese Initials and Finals --- p.132
Chapter A.1 --- Tables of All Cantonese Initials and Finals --- p.132
Chapter B --- Using Distortion Measure as Error Function in Neural Network --- p.135
Chapter B.1 --- Formulation of Itakura-Saito Distortion Measure for Neural Network Error Function --- p.135
Chapter B.2 --- Formulation of a Modified Itakura-Saito Distortion (MISD) Measure for Neural Network Error Function --- p.137
Chapter C --- Orthogonal Least Square Algorithm for RBFNet Training --- p.138
Chapter C.1 --- Orthogonal Least Squares Learning Algorithm for Radial Basis Function Network Training --- p.138
Chapter D --- Phrase Lists --- p.140
Chapter D.1 --- Two-Syllable Phrase List for the Pause Duration Experiment --- p.140
Chapter D.1.1 --- Two-Character Words (兩字詞) --- p.140
Chapter D.2 --- Three/Four-Syllable Phrase List for the Pause Duration Experiment --- p.144
Chapter D.2.1 --- Phrases (片語) --- p.144