Automatic recognition of isolated Cantonese syllables using neural networks =: 利用神經網絡識別粤語單音節.
Other Authors: | |
---|---|
Format: | Others |
Language: | English |
Published: | Chinese University of Hong Kong, 1996 |
Subjects: | |
Online Access: | http://library.cuhk.edu.hk/record=b5888875 http://repository.lib.cuhk.edu.hk/en/item/cuhk-321646 |
Summary: | by Tan Lee. === Thesis (Ph.D.)--Chinese University of Hong Kong, 1996. === Includes bibliographical references. === Chapter 1 --- Introduction --- p.1 === Chapter 1.1 --- Conventional Pattern Recognition Approaches for Speech Recognition --- p.3 === Chapter 1.2 --- A Review on Neural Network Applications in Speech Recognition --- p.6 === Chapter 1.2.1 --- Static Pattern Classification --- p.7 === Chapter 1.2.2 --- Hybrid Approaches --- p.9 === Chapter 1.2.3 --- Dynamic Neural Networks --- p.12 === Chapter 1.3 --- Automatic Recognition of Cantonese Speech --- p.16 === Chapter 1.4 --- Organization of the Thesis --- p.18 === References --- p.20 === Chapter 2 --- Phonological and Acoustical Properties of Cantonese Syllables --- p.29 === Chapter 2.1 --- Phonology of Cantonese --- p.29 === Chapter 2.1.1 --- Basic Phonetic Units --- p.30 === Chapter 2.1.2 --- Syllabic Structure --- p.32 === Chapter 2.1.3 --- Lexical Tones --- p.33 === Chapter 2.2 --- Acoustical Properties of Cantonese Syllables --- p.35 === Chapter 2.2.1 --- Spectral Features --- p.35 === Chapter 2.2.2 --- Energy and Zero-Crossing Rate --- p.39 === Chapter 2.2.3 --- Pitch --- p.40 === Chapter 2.2.4 --- Duration --- p.41 === Chapter 2.3 --- Acoustic Feature Extraction for Speech Recognition of Cantonese --- p.42 === References --- p.46 === Chapter 3 --- Tone Recognition of Isolated Cantonese Syllables --- p.48 === Chapter 3.1 --- Acoustic Pre-processing --- p.48 === Chapter 3.1.1 --- Voiced Portion Detection --- p.48 === Chapter 3.1.2 --- Pitch Extraction --- p.51 === Chapter 3.2 --- Supra-Segmental Feature Parameters for Tone Recognition --- p.53 === Chapter 3.2.1 --- Pitch-Related Feature Parameters --- p.53 === Chapter 3.2.2 --- Duration and Energy Drop Rate --- p.55 === Chapter 3.2.3 --- Normalization of Feature Parameters --- p.57 === Chapter 3.3 --- An MLP Based Tone Classifier --- p.58 === Chapter 3.4 --- Simulation Experiments --- p.59 === Chapter 3.4.1 --- Speech Data --- p.59 === Chapter 3.4.2 --- Feature Extraction and Normalization --- p.61 === Chapter 3.4.3 --- Experimental Results --- p.61 === Chapter 3.5 --- Discussion and Conclusion --- p.64 === References --- p.65 === Chapter 4 --- Recurrent Neural Network Based Dynamic Speech Models --- p.67 === Chapter 4.1 --- Motivations and Rationales --- p.68 === Chapter 4.2 --- RNN Speech Model (RSM) --- p.71 === Chapter 4.2.1 --- Network Architecture and Dynamic Operation --- p.71 === Chapter 4.2.2 --- RNN for Speech Modeling --- p.72 === Chapter 4.2.3 --- Illustrative Examples --- p.75 === Chapter 4.3 --- Training of RNN Speech Models --- p.78 === Chapter 4.3.1 --- Real-Time-Recurrent-Learning (RTRL) Algorithm --- p.78 === Chapter 4.3.2 --- Iterative Re-segmentation Training of RSM --- p.80 === Chapter 4.4 --- Several Practical Issues in RSM Training --- p.85 === Chapter 4.4.1 --- Combining Adjacent Segments --- p.85 === Chapter 4.4.2 --- Hypothesizing Initial Segmentation --- p.86 === Chapter 4.4.3 --- Improving Temporal State Dependency --- p.89 === Chapter 4.5 --- Simulation Experiments --- p.90 === Chapter 4.5.1 --- Experiment 4.1 - Training with a Single Utterance --- p.91 === Chapter 4.5.2 --- Experiment 4.2 - Effect of Augmenting Recurrent Learning Rate --- p.93 === Chapter 4.5.3 --- Experiment 4.3 - Training with Multiple Utterances --- p.96 === Chapter 4.5.4 --- Experiment 4.4 - Modeling Performance of RSMs --- p.99 === Chapter 4.6 --- Conclusion --- p.104 === References --- p.106 === Chapter 5 --- Isolated Word Recognition Using RNN Speech Models --- p.107 === Chapter 5.1 --- A Baseline System --- p.107 === Chapter 5.1.1 --- System Description --- p.107 === Chapter 5.1.2 --- Simulation Experiments --- p.110 === Chapter 5.1.3 --- Discussion --- p.117 === Chapter 5.2 --- Incorporating Duration Information --- p.118 === Chapter 5.2.1 --- Duration Screening --- p.118 === Chapter 5.2.2 --- Determination of Duration Bounds --- p.120 === Chapter 5.2.3 --- Simulation Experiments --- p.120 === Chapter 5.2.4 --- Discussion --- p.124 === Chapter 5.3 --- Discriminative Training --- p.125 === Chapter 5.3.1 --- The Minimum Classification Error Formulation --- p.126 === Chapter 5.3.2 --- Generalized Probabilistic Descent Algorithm --- p.127 === Chapter 5.3.3 --- Determination of Training Parameters --- p.128 === Chapter 5.3.4 --- Simulation Experiments --- p.129 === Chapter 5.3.5 --- Discussion --- p.133 === Chapter 5.4 --- Conclusion --- p.134 === References --- p.135 === Chapter 6 --- An Integrated Speech Recognition System for Cantonese Syllables --- p.137 === Chapter 6.1 --- System Architecture and Recognition Scheme --- p.137 === Chapter 6.2 --- Speech Corpus and Data Pre-processing --- p.140 === Chapter 6.3 --- Recognition Experiments and Results --- p.140 === Chapter 6.4 --- Discussion and Conclusion --- p.144 === References --- p.146 === Chapter 7 --- Conclusions and Suggestions for Future Work --- p.147 === Chapter 7.1 --- Conclusions --- p.147 === Chapter 7.2 --- Suggestions for Future Work --- p.151 |
---|