Environmental Sound Event Classification Based on Modulation Spectral Vectors

碩士 === 國立清華大學 === 電機工程學系所 === 105 === The Gaussian mixture model (GMM) has developed well both in the speech and sound recognition, but it does not perform well in the high background noisy environment. This thesis proposes a method combining short-term and long-term features to overcome this issue....

Full description

Bibliographic Details
Main Authors: Liao, Jyun-Ci, 廖俊祺
Other Authors: Liu, Yi-Wen
Format: Others
Language:zh-TW
Published: 2017
Online Access:http://ndltd.ncl.edu.tw/handle/77th6k
Description
Summary:碩士 === 國立清華大學 === 電機工程學系所 === 105 === The Gaussian mixture model (GMM) has developed well both in the speech and sound recognition, but it does not perform well in the high background noisy environment. This thesis proposes a method combining short-term and long-term features to overcome this issue. Here the short-term features are Mel-frequency cepstral coefficients (MFCCs) and the long-term features are the modulation spectral vectors (MSVs) calculated in the frequency domain. The MSVs contains the envelope message of signals which is a good feature against high noise. For robustness against noise, this thesis proposes a method to learn noisy data while training on GMMs. This method could raise the recognition accuracy in the low singal-to-noise ratio (SNR) case. The method was evaluated on a database which consists of 8 different indoor sound event classes. It achieves > 80 % accuracy at 0 dB SNR.