Summary: | Master's === Tamkang University === Department of Electrical Engineering === 85 === This paper focuses on speaker identification and gives a detailed introduction to speech features. The system uses fuzzy theory, neural networks, and a genetic algorithm as its recognition structure.
In text-dependent speaker identification, the author used only a back-propagation neural network as the recognition scheme and trained each speaker's personal neural network with a genetic algorithm. The experimental results show that features combining the static cepstrum with the dynamic spectrum give better results. The recognition rate decreases as the number of speakers increases, but the overall recognition rate remains above 98% with a 0.8-second test segment. This shows that the system has high potential for further research.
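As an illustration of combining static and dynamic features, the sketch below appends a regression-based delta to each static cepstral frame. The abstract does not say how the dynamic features were computed, so the delta formula, the window `width`, the edge padding, and the function names `delta_features` and `combine_static_dynamic` are all assumptions made for this example.

```python
def delta_features(frames, width=2):
    """Regression-based delta over +/- `width` neighbouring frames.

    `frames` is a list of equal-length lists of cepstral coefficients.
    Edges are padded by repeating the first/last frame (an assumption).
    """
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, width + 1))
    deltas = []
    for t in range(n):
        row = []
        for d in range(len(frames[t])):
            acc = 0.0
            for k in range(1, width + 1):
                nxt = frames[min(t + k, n - 1)][d]  # clamp at last frame
                prv = frames[max(t - k, 0)][d]      # clamp at first frame
                acc += k * (nxt - prv)
            row.append(acc / denom)
        deltas.append(row)
    return deltas


def combine_static_dynamic(frames, width=2):
    """Concatenate each static frame with its delta (dynamic) vector."""
    deltas = delta_features(frames, width)
    return [static + dynamic for static, dynamic in zip(frames, deltas)]


# Hypothetical one-coefficient "cepstrum" that rises by 1 per frame.
frames = [[0.0], [1.0], [2.0], [3.0], [4.0]]
combined = combine_static_dynamic(frames)
```

Each combined vector is twice as long as the static one, so a network taking these features as input would need a correspondingly wider input layer.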
In text-independent speaker identification, this paper proposes a two-stage recognition structure. First, the zero-crossing rate and the average first-formant value of the spectral envelope (Vi) are used as features. Distributed fuzzy rules are then applied to the speech data to delete silence and consonants. The distributed fuzzy rules do not need much training data and perform well in clustering problems, so they are well suited to clustering the phonemic features of speech data even when little training data is available. To make the system practical, the author uses a genetic algorithm to select fuzzy rules and delete the unnecessary ones. The experimental results show that the distributed fuzzy rules cluster the speech data efficiently and adapt to independent speakers. In addition, the genetic algorithm can delete many fuzzy rules while lowering the recognition rate by less than 1%. The system deletes silence and unstable consonants in continuous speech, so the neural networks operate faster and on more representative data. The experimental results show that the overall recognition rate can be improved. Finally, the proposed two-stage recognition structure makes speaker identification automatic.
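The zero-crossing rate used in the first stage can be sketched as follows. This is a minimal illustration, not the author's implementation: the framing parameters and the helper names `frame_signal` and `zero_crossing_rate` are assumptions, and the fuzzy rules that would consume these values are not shown.

```python
def frame_signal(samples, frame_len, hop):
    """Split a sample sequence into overlapping fixed-length frames."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


# Hypothetical signal: an alternating (noise-like) segment followed by a
# constant-sign segment.
signal = [1, -1, 1, -1, 1, 2, 3, 4]
frames = frame_signal(signal, frame_len=4, hop=4)
rates = [zero_crossing_rate(f) for f in frames]
```

Unvoiced consonants tend to show a high zero-crossing rate and voiced vowels a low one, which is why this value is commonly paired with a spectral feature when separating speech classes.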
|