Performance Evaluation and Improvement of Speaker Recognition under Multiple Environments


Bibliographic Details
Main Authors: Shih-Wei Chang, 張世維
Other Authors: 譚旦旭
Format: Others
Language: zh-TW
Published: 2003
Online Access: http://ndltd.ncl.edu.tw/handle/80596537764174709678
Description
Summary: Master's thesis === National Taipei University of Technology === Master's Program, Department of Electrical Engineering === 91 === With the rapidly growing number of mobile phone users, the promising market of mobile commerce is receiving much attention. However, verifying a customer's identity is an issue of the first importance to such services. Among all biometrics, human speech has been recognized as one of the most convenient features for verifying personal identity. Therefore, the main purpose of this study is to evaluate and improve the performance of speaker recognition over a real GSM environment. In addition, we investigate speaker recognition performance under diverse noisy environments. For performance evaluation, a GMM-based text-independent speaker recognition system is implemented using Mel-Frequency Cepstral Coefficients (MFCCs) extracted from artificially synthesized speech and real GSM speech. Results from the baseline experiments demonstrate that environmental mismatch is a major factor that significantly degrades recognition accuracy. To overcome this difficulty, we first investigate the applicability of several popular speech recognition compensation schemes to speaker recognition. Experimental results show that only limited improvement is obtained. To further alleviate the adverse effects of mismatch, two novel strategies based on multi-environment model selection are proposed: (1) a speech adaptive detection (SAD) based model selection scheme, and (2) an integrated model based selection scheme. The results indicate that the first approach can yield recognition accuracy comparable to that under matched conditions, with a considerable reduction in computational complexity. The second approach, on the other hand, is more robust to unknown noisy conditions, and its accuracy is very close to that of the first.
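The summary does not describe how the two selection schemes work internally, so the following is only a generic sketch of the underlying idea: keep one GMM per speaker and per environment, score the test frames against every model, and pick the best match. It is not the thesis's SAD or integrated-model scheme. All names (`diag_gmm_loglik`, `identify`, `TOY_MODELS`), the speakers "alice" and "bob", the environment labels, and the two-dimensional toy vectors standing in for MFCC frames are assumptions made for illustration.

```python
import math

def diag_gmm_loglik(frame, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    comp_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        comp_logs.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    top = max(comp_logs)
    return top + math.log(sum(math.exp(c - top) for c in comp_logs))

def average_loglik(frames, gmm):
    """Mean per-frame log-likelihood of an utterance under one GMM."""
    return sum(diag_gmm_loglik(f, *gmm) for f in frames) / len(frames)

def identify(frames, models):
    """Return the best (speaker, environment) key and all scores.

    models maps (speaker, environment) -> (weights, means, variances).
    """
    scores = {key: average_loglik(frames, gmm) for key, gmm in models.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical toy models; real systems would train these on MFCC frames.
TOY_MODELS = {
    ("alice", "clean"): ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    ("alice", "gsm"): ([1.0], [[3.0, 3.0]], [[1.0, 1.0]]),
    ("bob", "clean"): ([1.0], [[-3.0, -3.0]], [[1.0, 1.0]]),
    ("bob", "gsm"): ([1.0], [[0.0, 6.0]], [[1.0, 1.0]]),
}

# Toy "utterance": three frames lying near the alice/gsm model mean.
TOY_FRAMES = [[2.9, 3.1], [3.2, 2.8], [3.0, 3.0]]

best_key, all_scores = identify(TOY_FRAMES, TOY_MODELS)
print("best match:", best_key)
```

In this toy setup the test frames sit closest to the model trained for the matching environment, which mirrors the summary's observation that environmental mismatch degrades accuracy. Scoring every environment model exhaustively is the simplest form of selection; a scheme that first detects the environment would only need to score the models for that environment.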