Automatic Recognition of Life Sounds

Master's === National Tsing Hua University === Department of Electrical Engineering === 100 === Daily life is full of sounds of many kinds. Whether a sound is speech or non-speech, people recognize it by its characteristic features and infer what is happening around them. With technical advances, the identification of...

Bibliographic Details
Main Authors:	Wu, Chen Wei, 吳晨瑋
Other Authors:	Liu, Yi Wen
Format:	Others
Language:	zh-TW
Published:	2012
Online Access:	http://ndltd.ncl.edu.tw/handle/16103160345738752230

id	ndltd-TW-100NTHU5442108
record_format	oai_dc
spelling	ndltd-TW-100NTHU5442108 2015-10-13T21:27:24Z http://ndltd.ncl.edu.tw/handle/16103160345738752230 Automatic Recognition of Life Sounds 生活聲響之自動辨認 Wu, Chen Wei 吳晨瑋 Master's, National Tsing Hua University, Department of Electrical Engineering, 100 Liu, Yi Wen 劉奕汶 2012 學位論文 ; thesis 73 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === National Tsing Hua University === Department of Electrical Engineering === 100 === Daily life is full of sounds of many kinds. Whether a sound is speech or non-speech, people recognize it by its characteristic features and infer what is happening around them. With technical advances, sound identification has gradually become a practical technology, especially in speech recognition, and it has begun to find its way into home safety. Regardless of a user's age or status, emergencies can happen at home, and they are often accompanied by non-speech sounds. Past work on sound recognition focused mostly on speech and speaker recognition. If sounds that indicate dangerous situations in the house can be classified and recognized, this will help analyze the scenario and increase the sense of security of people living alone. In this thesis, we collected eight classes of audio files, 372 files in total, for experiments. The files were divided equally into training and testing datasets, and we used them to develop methods for sound recognition in normal and noisy conditions. For feature extraction, the feature vector consists of Mel-frequency cepstral coefficients (MFCC) and perceptual features. A Gaussian mixture model (GMM) is used as the front end of the classifier, and an outlier rejection mechanism is added to it. The outlier rejection mechanism is based on the likelihood ratio test (LRT), which compares the test audio files and the non-dataset files, respectively, with the dataset. In this way, non-dataset audio files are prevented from being forced into one of the known classes by mistake. We use three methods to classify the audio files: the variance-mean method, the frame-vote method, and the selected frame-vote method. At present, when test audio files are compared against the dataset, the methods reach a recognition accuracy of up to 96.24% in the normal condition. In addition, we carry out a thorough evaluation of robustness against noise and echoes. For the outlier rejection mechanism, we collected a total of 120 non-dataset audio files for experiments, and the overall error rate can be reduced to 19%. In a further experiment with a total of 100 non-dataset audio files, the overall error rate can be reduced to 23%.
author2	Liu, Yi Wen
author_facet	Liu, Yi Wen Wu, Chen Wei 吳晨瑋
author	Wu, Chen Wei 吳晨瑋
spellingShingle	Wu, Chen Wei 吳晨瑋 Automatic Recognition of Life Sounds
author_sort	Wu, Chen Wei
title	Automatic Recognition of Life Sounds
title_short	Automatic Recognition of Life Sounds
title_full	Automatic Recognition of Life Sounds
title_fullStr	Automatic Recognition of Life Sounds
title_full_unstemmed	Automatic Recognition of Life Sounds
title_sort	automatic recognition of life sounds
publishDate	2012
url	http://ndltd.ncl.edu.tw/handle/16103160345738752230
work_keys_str_mv	AT wuchenwei automaticrecognitionoflifesounds AT wúchénwěi automaticrecognitionoflifesounds AT wuchenwei shēnghuóshēngxiǎngzhīzìdòngbiànrèn AT wúchénwěi shēnghuóshēngxiǎngzhīzìdòngbiànrèn
_version_	1718063430580043776
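
The pipeline summarized in the abstract (per-class likelihood models, frame-level voting, and likelihood-ratio outlier rejection) can be illustrated with a minimal sketch. This is not the thesis implementation, and several things here are assumptions made only for illustration: each class is modeled by a single diagonal Gaussian rather than a multi-component GMM, the features are toy 2-D vectors rather than MFCC and perceptual features, "frame vote" is read as a majority vote over per-frame best classes, and the likelihood ratio is taken against a pooled background model with a hypothetical threshold of 0. The class names `class_a` and `class_b` and all function names are hypothetical.

```python
import math
from collections import Counter


def fit_diag_gaussian(frames):
    """Estimate per-dimension mean and variance from a list of feature frames."""
    n = len(frames)
    dims = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dims)]
    # Floor the variance so the log-density stays finite.
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
           for d in range(dims)]
    return mean, var


def log_likelihood(frame, model):
    """Log density of one frame under a diagonal Gaussian (mean, var)."""
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
               for x, m, v in zip(frame, mean, var))


def frame_vote(frames, class_models):
    """Each frame votes for its most likely class; the majority class wins."""
    votes = Counter(
        max(class_models, key=lambda c: log_likelihood(f, class_models[c]))
        for f in frames
    )
    return votes.most_common(1)[0][0]


def classify_with_rejection(frames, class_models, background_model, threshold=0.0):
    """Frame-vote classification followed by a likelihood ratio test.

    Returns the winning label, or None when the clip is rejected as an outlier.
    """
    label = frame_vote(frames, class_models)
    # Average per-frame log-likelihood ratio: winning class vs. background.
    llr = sum(
        log_likelihood(f, class_models[label]) - log_likelihood(f, background_model)
        for f in frames
    ) / len(frames)
    return label if llr >= threshold else None


# Toy training frames (assumed data, two hypothetical classes).
train = {
    "class_a": [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0], [0.1, 0.2]],
    "class_b": [[5.0, 5.1], [5.2, 4.9], [4.8, 5.0], [5.1, 5.2]],
}
class_models = {name: fit_diag_gaussian(fr) for name, fr in train.items()}
# Background model: one broad Gaussian fitted to all training frames pooled.
background = fit_diag_gaussian([f for fr in train.values() for f in fr])

in_class_clip = [[0.1, 0.0], [0.0, 0.1], [-0.1, 0.1]]
outlier_clip = [[2.5, 2.5], [2.6, 2.4], [2.4, 2.6]]
print(classify_with_rejection(in_class_clip, class_models, background))
print(classify_with_rejection(outlier_clip, class_models, background))
```

In this sketch the rejection step matters because voting alone always returns one of the known classes. The likelihood ratio gives a clip that fits no class model well a way to be labeled as outside the dataset instead.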

Similar Items