A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition


Bibliographic Details
Main Authors: Ching-yi Lin, 林靜宜
Other Authors: Tsang-long Pao
Format: Others
Language: en_US
Published: 2009
Online Access: http://ndltd.ncl.edu.tw/handle/77453967950782385836
id ndltd-TW-097TTU05392048
record_format oai_dc
spelling ndltd-TW-097TTU05392048 2016-05-02T04:11:11Z http://ndltd.ncl.edu.tw/handle/77453967950782385836 A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition 語音情緒辨識最佳參數之研究 Ching-yi Lin 林靜宜 Master's Tatung University (大同大學) Department of Computer Science and Engineering 97 Tsang-long Pao 包蒼龍 2009 學位論文 ; thesis 65 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === Tatung University (大同大學) === Department of Computer Science and Engineering === 97 === Humans express emotion in many ways, for instance through speech, attitude, or writing. Human speech conveys not only syntax but also the speaker's feeling at the moment of speaking. Emotions therefore play an important role in speech communication, and recognizing human emotion in speech signals has attracted considerable attention. In emotion recognition, the classifier and the features used in the system both influence the recognition rate. The purpose of this study is to find the most effective feature set for a specific classifier used in speech emotion recognition. There are three main focuses: classifiers, emotion corpus combinations, and the features to be analyzed. In this thesis, 78 speech features are extracted, including Formant, Shimmer, Jitter, Linear Predictive Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC), Mel-Frequency Cepstral Coefficients (MFCC), first derivative of MFCC (D-MFCC), second derivative of MFCC (DD-MFCC), Log Frequency Power Coefficients (LFPC), Perceptual Linear Prediction (PLP), RelAtive SpecTrAl PLP (RastaPLP), Log-Energy, and Zero Crossing Rate (ZCR), as well as their mean, standard deviation, minimum, maximum, and range. The effect of the features is analyzed with sequential forward selection (SFS). Experimental results indicate that, for five emotions, the most effective feature set with WD-KNN obtains the highest recognition accuracy of 90% using 13 features. The results also show that the most effective feature among all extracted features for emotion recognition is Linear Predictive Coefficients (LPC): it appears in the most effective feature set for every classifier tested.
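The abstract names sequential forward selection (SFS) but does not give the thesis's implementation. The following is only a minimal sketch of the generic greedy procedure, under stated assumptions: `toy_score` is a hypothetical stand-in for the recognition accuracy of a classifier such as WD-KNN, the feature indices are illustrative, and the stopping rule (stop when no candidate improves the score) is one common choice rather than the rule the author necessarily used.

```python
from typing import Callable, List, Sequence, Tuple


def sequential_forward_selection(
    n_features: int,
    score: Callable[[Sequence[int]], float],
    max_features: int,
) -> Tuple[List[int], float]:
    """Greedy SFS: start from an empty set and repeatedly add the single
    feature that raises the score the most; stop when nothing improves."""
    selected: List[int] = []
    best_score = float("-inf")
    while len(selected) < max_features:
        candidates = [f for f in range(n_features) if f not in selected]
        if not candidates:
            break
        # Score every one-feature extension of the current subset.
        trial_scores = [(score(selected + [f]), f) for f in candidates]
        trial_best, feature = max(trial_scores)
        if trial_best <= best_score:
            break  # no candidate improves on the current subset
        selected.append(feature)
        best_score = trial_best
    return selected, best_score


# Hypothetical scoring function (an assumption, not from the thesis):
# features 2 and 5 help, every other feature slightly hurts.
def toy_score(subset: Sequence[int]) -> float:
    useful = {2: 0.5, 5: 0.3}
    return sum(useful.get(f, -0.05) for f in subset)


chosen, accuracy = sequential_forward_selection(8, toy_score, max_features=13)
print(chosen, round(accuracy, 2))
```

In a real experiment, `score` would train and evaluate the chosen classifier on only the selected feature columns, for example with cross-validation, so each SFS step costs one evaluation per remaining feature.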
author2 Tsang-long Pao
author Ching-yi Lin
林靜宜
title A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition
publishDate 2009
url http://ndltd.ncl.edu.tw/handle/77453967950782385836