A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition

Master's === Tatung University === Department of Computer Science and Engineering === 97 === There are many ways for humans to express their emotions, for instance through speech, attitude, or writing. Human speech conveys not only syntax but also the speaker's feeling at the moment of speaking. Emotions therefore play an important role in speech communication, and recogn...

Bibliographic Details
Main Authors:	Ching-yi Lin, 林靜宜
Other Authors:	Tsang-long Pao
Format:	Others
Language:	en_US
Published:	2009
Online Access:	http://ndltd.ncl.edu.tw/handle/77453967950782385836

id	ndltd-TW-097TTU05392048
record_format	oai_dc
spelling	ndltd-TW-097TTU05392048 2016-05-02T04:11:11Z http://ndltd.ncl.edu.tw/handle/77453967950782385836 A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition 語音情緒辨識最佳參數之研究 Ching-yi Lin 林靜宜 Master's, Tatung University, Department of Computer Science and Engineering, 97. There are many ways for humans to express their emotions, for instance through speech, attitude, or writing. Human speech conveys not only syntax but also the speaker's feeling at the moment of speaking. Emotions therefore play an important role in speech communication, and recognizing human emotion in speech signals has attracted considerable attention. In emotion recognition, the classifier and the features used in the system both influence the recognition rate. The purpose of this study is to find the most effective feature set for a specific classifier used in speech emotion recognition. There are three main focuses: the classifiers, the emotion corpus combinations, and the features to be analyzed. In this thesis we extract 78 speech features, including Formant, Shimmer, Jitter, Linear Predictive Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC), Mel-Frequency Cepstral Coefficients (MFCC), first derivative of MFCC (D-MFCC), second derivative of MFCC (DD-MFCC), Log Frequency Power Coefficients (LFPC), Perceptual Linear Prediction (PLP), RelAtive SpecTrAl PLP (RastaPLP), Log-Energy, and Zero Crossing Rate (ZCR), as well as their mean, standard deviation, minimum, maximum, and range. We analyze the effect of the features with sequential forward selection (SFS). Experimental results indicate that the most effective feature set for five emotions using WD-KNN obtains the highest recognition accuracy, 90%, with 13 features. The results show that the most effective feature among all extracted features for emotion recognition is Linear Predictive Coefficients (LPC): it appears in the most effective feature set of every classifier tested. Tsang-long Pao 包蒼龍 2009 thesis 65 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Master's === Tatung University === Department of Computer Science and Engineering === 97 === There are many ways for humans to express their emotions, for instance through speech, attitude, or writing. Human speech conveys not only syntax but also the speaker's feeling at the moment of speaking. Emotions therefore play an important role in speech communication, and recognizing human emotion in speech signals has attracted considerable attention. In emotion recognition, the classifier and the features used in the system both influence the recognition rate. The purpose of this study is to find the most effective feature set for a specific classifier used in speech emotion recognition. There are three main focuses: the classifiers, the emotion corpus combinations, and the features to be analyzed. In this thesis we extract 78 speech features, including Formant, Shimmer, Jitter, Linear Predictive Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC), Mel-Frequency Cepstral Coefficients (MFCC), first derivative of MFCC (D-MFCC), second derivative of MFCC (DD-MFCC), Log Frequency Power Coefficients (LFPC), Perceptual Linear Prediction (PLP), RelAtive SpecTrAl PLP (RastaPLP), Log-Energy, and Zero Crossing Rate (ZCR), as well as their mean, standard deviation, minimum, maximum, and range. We analyze the effect of the features with sequential forward selection (SFS). Experimental results indicate that the most effective feature set for five emotions using WD-KNN obtains the highest recognition accuracy, 90%, with 13 features. The results show that the most effective feature among all extracted features for emotion recognition is Linear Predictive Coefficients (LPC): it appears in the most effective feature set of every classifier tested.
author2	Tsang-long Pao
author_facet	Tsang-long Pao Ching-yi Lin 林靜宜
author	Ching-yi Lin 林靜宜
spellingShingle	Ching-yi Lin 林靜宜 A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition
author_sort	Ching-yi Lin
title	A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition
title_short	A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition
title_full	A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition
title_fullStr	A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition
title_full_unstemmed	A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition
title_sort	study on identifying the most effective speech features for speech emotion recognition
publishDate	2009
url	http://ndltd.ncl.edu.tw/handle/77453967950782385836
work_keys_str_mv	AT chingyilin astudyonidentifyingthemosteffectivespeechfeaturesforspeechemotionrecognition AT línjìngyí astudyonidentifyingthemosteffectivespeechfeaturesforspeechemotionrecognition AT chingyilin yǔyīnqíngxùbiànshízuìjiācānshùzhīyánjiū AT línjìngyí yǔyīnqíngxùbiànshízuìjiācānshùzhīyánjiū AT chingyilin studyonidentifyingthemosteffectivespeechfeaturesforspeechemotionrecognition AT línjìngyí studyonidentifyingthemosteffectivespeechfeaturesforspeechemotionrecognition
_version_	1718253440968163328
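
The sequential forward selection (SFS) procedure named in the description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the record does not describe the WD-KNN classifier or the evaluation protocol, so a plain k-nearest-neighbour vote and leave-one-out accuracy are swapped in as stand-ins, and all function names (`knn_predict`, `loo_accuracy`, `sequential_forward_selection`) are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, cols, k=3):
    """Plain k-NN majority vote (a stand-in, NOT the thesis's WD-KNN),
    using only the feature columns listed in `cols`."""
    dists = sorted(
        (math.sqrt(sum((row[c] - x[c]) ** 2 for c in cols)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(X, y, cols, k=3):
    """Leave-one-out accuracy of the stand-in classifier (assumed protocol)."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, X[i], cols, k) == y[i]:
            hits += 1
    return hits / len(X)


def sequential_forward_selection(X, y, max_features=None, k=3):
    """Greedy SFS: start from an empty set and repeatedly add the single
    feature that raises accuracy the most; stop when nothing improves."""
    n_features = len(X[0])
    limit = min(max_features or n_features, n_features)
    selected, best_score = [], 0.0
    while len(selected) < limit:
        candidates = [c for c in range(n_features) if c not in selected]
        scored = [(loo_accuracy(X, y, selected + [c], k), c) for c in candidates]
        # Highest score wins; ties go to the lower feature index.
        score, feature = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best_score:
            break  # no candidate improves on the current subset
        selected.append(feature)
        best_score = score
    return selected, best_score
```

Called on a feature matrix given as a list of rows and a parallel list of emotion labels, the function returns the indices of the chosen feature columns together with the accuracy reached by that subset.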

Similar Items