Environmental Sound Event Classification Based on Modulation Spectral Vectors

Master's thesis === National Tsing Hua University === Department of Electrical Engineering === 105 === The Gaussian mixture model (GMM) is well established in both speech and sound recognition, but it performs poorly in environments with high background noise. This thesis proposes a method combining short-term and long-term features to overcome this issue....


Bibliographic Details
Main Authors: Liao, Jyun-Ci, 廖俊祺
Other Authors: Liu, Yi-Wen
Format: Others
Language: zh-TW
Published: 2017
Online Access: http://ndltd.ncl.edu.tw/handle/77th6k
id ndltd-TW-105NTHU5441070
record_format oai_dc
spelling ndltd-TW-105NTHU5441070 2019-05-16T00:00:23Z http://ndltd.ncl.edu.tw/handle/77th6k Environmental Sound Event Classification Based on Modulation Spectral Vectors 基於調制頻譜向量之環境聲響事件分類 Liao, Jyun-Ci 廖俊祺 Master's thesis National Tsing Hua University Department of Electrical Engineering 105 The Gaussian mixture model (GMM) is well established in both speech and sound recognition, but it performs poorly in environments with high background noise. This thesis proposes a method combining short-term and long-term features to overcome this issue. The short-term features are Mel-frequency cepstral coefficients (MFCCs), and the long-term features are modulation spectral vectors (MSVs) calculated in the frequency domain. The MSVs capture the envelope information of the signal, which makes them a good feature under high noise. For further robustness against noise, the thesis proposes including noisy data when training the GMMs, which raises recognition accuracy at low signal-to-noise ratios (SNRs). The method was evaluated on a database of 8 indoor sound event classes and achieves > 80 % accuracy at 0 dB SNR. Liu, Yi-Wen 劉奕汶 2017 degree thesis ; thesis 48 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's thesis === National Tsing Hua University === Department of Electrical Engineering === 105 === The Gaussian mixture model (GMM) is well established in both speech and sound recognition, but it performs poorly in environments with high background noise. This thesis proposes a method combining short-term and long-term features to overcome this issue. The short-term features are Mel-frequency cepstral coefficients (MFCCs), and the long-term features are modulation spectral vectors (MSVs) calculated in the frequency domain. The MSVs capture the envelope information of the signal, which makes them a good feature under high noise. For further robustness against noise, the thesis proposes including noisy data when training the GMMs, which raises recognition accuracy at low signal-to-noise ratios (SNRs). The method was evaluated on a database of 8 indoor sound event classes and achieves > 80 % accuracy at 0 dB SNR.
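The description reports accuracy at 0 dB SNR and mentions training on noisy data, but it does not say how the noisy examples were produced. One common approach, assumed here purely for illustration, is to scale a noise recording so that the mixture reaches a target SNR. The function names `power` and `mix_at_snr`, the 440 Hz test tone, and the Gaussian noise are all hypothetical and are not taken from the thesis.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in x) / len(x)


def mix_at_snr(clean, noise, snr_db):
    """Return (clean + scaled noise, scaled noise) at the requested SNR in dB."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    p_clean, p_noise = power(clean), power(noise)
    if p_noise == 0:
        raise ValueError("noise must not be silent")
    # SNR_dB = 10 * log10(P_clean / (g^2 * P_noise)); solve for the gain g.
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    scaled = [gain * n for n in noise]
    return [c + n for c, n in zip(clean, scaled)], scaled


# Illustrative signals only: one second of a sine tone plus white Gaussian noise.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

noisy, scaled_noise = mix_at_snr(clean, noise, snr_db=0.0)
measured_snr_db = 10 * math.log10(power(clean) / power(scaled_noise))
```

At 0 dB the scaled noise carries the same average power as the clean signal, which is the condition under which the record reports over 80 % accuracy.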
author2 Liu, Yi-Wen
author_facet Liu, Yi-Wen
Liao, Jyun-Ci
廖俊祺
author Liao, Jyun-Ci
廖俊祺
spellingShingle Liao, Jyun-Ci
廖俊祺
Environmental Sound Event Classification Based on Modulation Spectral Vectors
author_sort Liao, Jyun-Ci
title Environmental Sound Event Classification Based on Modulation Spectral Vectors
title_short Environmental Sound Event Classification Based on Modulation Spectral Vectors
title_full Environmental Sound Event Classification Based on Modulation Spectral Vectors
title_fullStr Environmental Sound Event Classification Based on Modulation Spectral Vectors
title_full_unstemmed Environmental Sound Event Classification Based on Modulation Spectral Vectors
title_sort environmental sound event classification based on modulation spectral vectors
publishDate 2017
url http://ndltd.ncl.edu.tw/handle/77th6k
work_keys_str_mv AT liaojyunci environmentalsoundeventclassificationbasedonmodulationspectralvectors
AT liàojùnqí environmentalsoundeventclassificationbasedonmodulationspectralvectors
AT liaojyunci jīyúdiàozhìpínpǔxiàngliàngzhīhuánjìngshēngxiǎngshìjiànfēnlèi
AT liàojùnqí jīyúdiàozhìpínpǔxiàngliàngzhīhuánjìngshēngxiǎngshìjiànfēnlèi
_version_ 1719157942786195456