Summary: | Master's === National Chiao Tung University === Department of Communication Engineering === 87 === Previous work on automatic Chinese-dialect identification used an acoustic-phonotactic model that allows the system to distinguish three dialects from one another in a multi-speaker (MS) environment. However, when the task is extended to the speaker-independent (SI) mode, the well-trained identifier degrades seriously because of the mismatch between the training and testing conditions. To overcome this problem, several well-established solutions were applied: cepstral mean subtraction (CMS), spectral transformation, maximum a posteriori (MAP) adaptation, and maximum likelihood linear regression (MLLR). The experimental results indicate, however, that these speaker compensation schemes, which were developed for speech recognition, are less successful for dialect identification. We speculate that speaker compensation may destroy the discriminability of the acoustic-phonotactic model. In response, an acoustic-based vector quantization (VQ) distortion identifier with codebook adaptation is developed to alleviate the speaker mismatch problem. Simulation results indicate that the VQ-distortion identifier extends easily to an SI system with little degradation.
|
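The VQ-distortion idea in the summary can be illustrated with a minimal sketch: one codebook is trained per dialect, and a test utterance is assigned to the dialect whose codebook quantizes its frames with the lowest average distortion. Everything below is an assumption made for illustration and is not taken from the thesis: the function names, the plain k-means training, the squared-Euclidean distortion, and the placeholder dialect labels "A", "B", and "C". The codebook adaptation step is left out because the summary does not describe how it is done.

```python
import random
from typing import Dict, List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(frames: List[Vector], size: int,
                   iters: int = 20, seed: int = 0) -> List[List[float]]:
    """Train a VQ codebook with plain k-means (an assumed training method)."""
    rng = random.Random(seed)
    # Initialise codewords from randomly chosen training frames.
    codebook = [list(v) for v in rng.sample(list(frames), size)]
    for _ in range(iters):
        cells: List[List[Vector]] = [[] for _ in range(size)]
        # Assign every frame to its nearest codeword.
        for f in frames:
            j = min(range(size), key=lambda k: sq_dist(f, codebook[k]))
            cells[j].append(f)
        # Move each codeword to the centroid of its cell; empty cells keep
        # their previous codeword.
        for k, cell in enumerate(cells):
            if cell:
                codebook[k] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook


def avg_distortion(frames: List[Vector], codebook: List[Vector]) -> float:
    """Average distance from each frame to its nearest codeword."""
    return sum(min(sq_dist(f, c) for c in codebook) for f in frames) / len(frames)


def identify(frames: List[Vector], codebooks: Dict[str, List[Vector]]) -> str:
    """Return the label whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda name: avg_distortion(frames, codebooks[name]))
```

In use, `train_codebook` would be called once per dialect on that dialect's training frames (for example cepstral feature vectors), and `identify` would then be called on the frames of each test utterance. The codebook size and the feature type are free parameters in this sketch.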