Summary: | Master's === National Taipei University of Technology === Master's Program, Department of Electrical Engineering === 91 === With the rapidly growing number of mobile phone users, the promising mobile commerce market is receiving much attention. However, verifying a customer's identity is a primary concern for such services. Among biometrics, human speech has been recognized as one of the most convenient features for verifying personal identity. The main purpose of this study is therefore to evaluate and improve the performance of speaker recognition in a real GSM environment. In addition, we investigate speaker recognition performance under diverse noisy environments.
For performance evaluation, a GMM-based text-independent speaker recognition system is implemented using Mel-Frequency Cepstral Coefficients (MFCCs) extracted from artificially synthesized speech and real GSM speech. Baseline experiments demonstrate that environmental mismatch is a major factor that significantly degrades recognition accuracy. To overcome this difficulty, we first investigate whether several popular compensation schemes from speech recognition are applicable to speaker recognition. Experimental results show that they yield only limited improvement. To further alleviate the adverse effects of mismatch, two novel strategies based on multi-environment model selection are proposed: (1) a speech adaptive detection (SAD) based model selection scheme, and (2) an integrated-model based selection scheme. The results indicate that the first approach yields recognition accuracy comparable to that under matched conditions, with a considerable reduction in computational complexity. The second approach is more robust to unknown noisy conditions, and its accuracy is very close to that of the first.
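As a rough illustration of GMM-based scoring, the sketch below computes the average per-frame log-likelihood of a few feature vectors under diagonal-covariance GMMs and picks the best-scoring model. It is not the system evaluated in this study: the function names, the toy two-dimensional "MFCC" vectors, the speaker labels, and the model parameters are all hypothetical. Taking the maximum over every (speaker, environment) model is only one plausible reading of multi-environment model selection and should not be taken as the SAD-based or integrated-model scheme proposed here.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of frames under a GMM.

    gmm is a list of (weight, mean, var) tuples, one per mixture component.
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(comps)  # log-sum-exp trick for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total / len(frames)


def identify(frames, models):
    """Return the (speaker, environment) key whose GMM scores highest."""
    return max(models, key=lambda key: gmm_log_likelihood(frames, models[key]))


# Hypothetical models: one GMM per (speaker, environment) pair.
# All numbers are made up for illustration, not trained on real speech.
models = {
    ("alice", "clean"): [
        (0.5, [0.0, 0.0], [1.0, 1.0]),
        (0.5, [1.0, 1.0], [1.0, 1.0]),
    ],
    ("alice", "gsm"): [(1.0, [0.5, 2.5], [1.0, 1.0])],
    ("bob", "clean"): [(1.0, [5.0, 5.0], [1.0, 1.0])],
    ("bob", "gsm"): [(1.0, [5.5, 7.5], [1.0, 1.0])],
}

# Toy test utterance: three frames close to the hypothetical ("bob", "gsm") model.
test_frames = [[5.4, 7.6], [5.6, 7.3], [5.5, 7.5]]
speaker, environment = identify(test_frames, models)
```

In this toy setup the speaker decision is the first element of the winning key. Scoring every environment-specific model costs more than first detecting the environment and scoring only the matching models, which is the kind of complexity trade-off mentioned above.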
|