Summary: | Master's === National Chiao Tung University === Department of Communication Engineering === 96 === In this study, a joint spectro-temporal auditory model is used to assess speech quality objectively. The first stage of the model mimics the spectrum estimation performed by the early cochlea, and the second stage mimics the multi-dimensional spectral analysis performed by the auditory cortex. The goal of this study is to predict the subjective mean opinion score (MOS).
Objective speech quality assessment can be done by two methods: intrusive and non-intrusive. In this study, we first observe and analyze the patterns of clean speech, noisy speech with different background noises, and speech degraded by different codecs at the two auditory stages. Second, we derive an objective estimate of the MOS from data-driven perceptual parameters that are believed to reflect people’s judgments of speech quality. The three perceptual parameters considered are intelligibility, naturalness, and pitch distortion. Finally, we use multiple regression analysis to model the relationship between speech quality and these perceptual parameters and obtain the predicted MOS. We then demonstrate that the MOS can be characterized quickly and reliably by these three perceptual features.
|
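The final regression step described in the summary can be illustrated with a minimal sketch. It assumes that each utterance has already been reduced to three scalar scores (intelligibility, naturalness, pitch distortion) and that a subjective MOS is available for each training utterance. The function names, the ordinary least-squares fit, and the clipping to the 1 to 5 MOS scale are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np


def fit_mos_regression(features, mos):
    """Fit MOS ~ b0 + b1*intelligibility + b2*naturalness + b3*pitch_distortion.

    `features` is an (n_utterances, 3) array of perceptual scores and `mos`
    holds the matching subjective scores. Both names are hypothetical.
    Returns the coefficient vector [b0, b1, b2, b3].
    """
    features = np.asarray(features, dtype=float)
    mos = np.asarray(mos, dtype=float)
    # Prepend a column of ones so the model includes an intercept term.
    design = np.column_stack([np.ones(len(features)), features])
    # Ordinary least squares: minimize ||design @ coeffs - mos||^2.
    coeffs, *_ = np.linalg.lstsq(design, mos, rcond=None)
    return coeffs


def predict_mos(coeffs, features):
    """Predict MOS for new utterances from fitted coefficients."""
    features = np.atleast_2d(np.asarray(features, dtype=float))
    raw = coeffs[0] + features @ coeffs[1:]
    # Assumption: scores are reported on the usual 1-5 opinion scale.
    return np.clip(raw, 1.0, 5.0)
```

A fitted model would be used as `predict_mos(fit_mos_regression(train_features, train_mos), new_features)`, where all three arrays are placeholders for data the summary does not provide.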