Summary: | Master's === National Cheng Kung University === Institute of Computer Science and Information Engineering === 88 === In many speech recognition applications, recognition performance degrades in the presence of surrounding noise because of the mismatch between the speech hidden Markov models (HMMs) and the test utterances. In this study, we propose transformation-based Bayesian predictive classification (TBPC) to improve noisy speech recognition.
To compensate for the mismatch in noisy environments, we transform the mean vectors of the HMMs by adding a bias vector. The uncertainty of the bias vector is represented by a Gaussian probability density function (pdf). Bayesian predictive classification incorporates the uncertainty of the parameters into the decision criterion. We therefore derive a robust TBPC decision rule for noisy speech recognition by combining the prior pdf of the transformation parameters with the theory of Bayesian predictive classification.
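As a rough illustration of the idea above, the following sketch integrates a Gaussian-distributed bias out of a single diagonal-covariance Gaussian, which gives a predictive density whose mean is shifted by the prior mean of the bias and whose variance is widened by the prior variance. It is a frame-level simplification under stated assumptions, not the thesis's HMM decision rule: the function names, the single-Gaussian class models, and the diagonal covariances are all hypothetical choices made for this example.

```python
import numpy as np

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    x, mean, var = (np.asarray(a, dtype=float) for a in (x, mean, var))
    return float(-0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var))

def log_predictive(x, mean, var, bias_mean, bias_var):
    """Predictive log density with the bias b ~ N(bias_mean, bias_var) integrated out.

    If x ~ N(mean + b, var) and b is Gaussian, the marginal of x is
    N(mean + bias_mean, var + bias_var) (sum of independent Gaussians).
    """
    shifted_mean = np.asarray(mean, dtype=float) + np.asarray(bias_mean, dtype=float)
    widened_var = np.asarray(var, dtype=float) + np.asarray(bias_var, dtype=float)
    return log_gauss_diag(x, shifted_mean, widened_var)

def classify(x, models, bias_mean, bias_var):
    """Return the label whose predictive log density is highest.

    models maps a label to a (mean, var) pair; this stands in for a set of
    HMMs and is an assumption of this sketch.
    """
    return max(
        models,
        key=lambda label: log_predictive(x, models[label][0], models[label][1],
                                         bias_mean, bias_var),
    )
```

With a zero-mean, zero-variance bias prior the rule reduces to ordinary plug-in classification; a nonzero prior mean moves every model toward the noisy condition, and a nonzero prior variance makes the decision less sensitive to the exact bias value.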
Another important feature of the thesis is online prior evolution (OPE). The hyperparameters of the prior pdf are continuously updated from incrementally observed test utterances, so they track the most recent environmental statistics. The updated hyperparameters are then applied to the TBPC decision, yielding a recognizer equipped with both a robust decision rule and online prior evolution.
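One simple way to picture such an incremental hyperparameter update is the standard conjugate update of a Gaussian prior on a mean offset, sketched below. This is only an assumed stand-in for OPE: the thesis's actual update formulas are not given in this summary, and the function name, the use of per-frame residuals (observation minus the aligned model mean), and the known diagonal observation variance are hypothetical.

```python
import numpy as np

def update_bias_prior(prior_mean, prior_var, residuals, obs_var):
    """Conjugate Gaussian update of the bias hyperparameters.

    residuals: array of shape (n_frames, dim), each row an observed frame
               minus the mean of the Gaussian it was aligned to.
    obs_var:   known diagonal observation variance, shape (dim,).
    Returns the posterior (mean, var), which serves as the prior for the
    next utterance.
    """
    prior_mean = np.asarray(prior_mean, dtype=float)
    prior_var = np.asarray(prior_var, dtype=float)
    residuals = np.atleast_2d(np.asarray(residuals, dtype=float))
    obs_var = np.asarray(obs_var, dtype=float)

    n_frames = residuals.shape[0]
    # Precisions add: prior precision plus one observation precision per frame.
    post_var = 1.0 / (1.0 / prior_var + n_frames / obs_var)
    # Posterior mean is the precision-weighted combination of prior and data.
    post_mean = post_var * (prior_mean / prior_var + residuals.sum(axis=0) / obs_var)
    return post_mean, post_var
```

Calling the function once per incoming utterance and feeding its output back in as the next prior gives the same result as a single batch update over all frames, which is what makes this kind of update suitable for online use.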
To verify the theory, we recorded noisy speech in hands-free car environments at driving speeds of 90, 50, and 0 (standby) km/h. In our experiments, recognition performance improved at each speed.
|