Summary: | Master's === National Chiao Tung University === Department of Computer Science and Information Engineering === 90 === This paper studies the learning of Gaussian mixture models (GMMs) and their application to speaker identification. Previous studies have shown that GMMs perform well for speaker identification, but they do not examine in depth the number of Gaussian components in each GMM or the type of covariance matrix (full or diagonal). We propose a self-growing learning method for GMMs, based on the Bayesian information criterion (BIC), that determines the number of Gaussian components of each GMM automatically. We also run speaker identification separately with full-covariance and diagonal-covariance GMMs and compare the experimental results. Our speaker database contains 19 female anchors and 3 male anchors, taken from MPEG files that we recorded from TV news with a capture card. On this database, the GMM speaker identifier attains an identification accuracy of 95.84% with full covariance matrices and 97.90% with diagonal covariance matrices. We also apply the GMM-based speaker identification method to TV-news anchor detection and news story segmentation. Using 7 hours of TV-news programs as test data, our experiments attain a precision of 90.20% and a recall of 92.5%.
|
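The abstract does not spell out the BIC formula or the growing procedure, so the following is only a minimal sketch of how a BIC score could be used to choose the number of Gaussian components. It assumes the common "lower is better" form, BIC = -2 ln L + k ln N, and the standard free-parameter counts for full and diagonal covariance GMMs. The function names (`gmm_free_parameters`, `bic`, `select_component_count`) are hypothetical and are not taken from the thesis. The log-likelihoods are expected to come from GMMs that have already been trained, for example with EM.

```python
import math


def gmm_free_parameters(n_components, dim, covariance="full"):
    """Count the free parameters of a GMM (hypothetical helper).

    Means contribute n_components * dim, mixture weights contribute
    n_components - 1 (they sum to one), and each covariance matrix
    contributes dim * (dim + 1) / 2 entries if full, or dim if diagonal.
    """
    mean_params = n_components * dim
    weight_params = n_components - 1
    if covariance == "full":
        cov_params = n_components * dim * (dim + 1) // 2
    elif covariance == "diag":
        cov_params = n_components * dim
    else:
        raise ValueError("covariance must be 'full' or 'diag'")
    return mean_params + cov_params + weight_params


def bic(log_likelihood, n_params, n_samples):
    """BIC in the assumed 'lower is better' form: -2 ln L + k ln N."""
    return -2.0 * log_likelihood + n_params * math.log(n_samples)


def select_component_count(log_likelihoods, dim, n_samples, covariance="full"):
    """Return the component count with the lowest BIC, plus all scores.

    log_likelihoods maps a component count to the total training
    log-likelihood of a GMM already fitted with that many components.
    """
    scores = {
        m: bic(ll, gmm_free_parameters(m, dim, covariance), n_samples)
        for m, ll in log_likelihoods.items()
    }
    best = min(scores, key=scores.get)
    return best, scores
```

A self-growing scheme in this spirit would add components one at a time and stop once the score no longer improves. Because a full covariance matrix costs many more parameters per component than a diagonal one, the penalty term grows faster in the full-covariance case for the same number of components.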