A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition
Master's === Tatung University === Department of Computer Science and Engineering === 97 === There are many ways for humans to express their emotions, for instance, speech, attitude or writing. Human speech conveys not only syntax but also the speaker's feeling at the moment of speaking. Thus, emotions play an important role in speech communication, and recogn...
Main Authors: | Ching-yi Lin |
---|---|
Other Authors: | Tsang-long Pao |
Format: | Others |
Language: | en_US |
Published: |
2009
|
Online Access: | http://ndltd.ncl.edu.tw/handle/77453967950782385836 |
id |
ndltd-TW-097TTU05392048 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-097TTU053920482016-05-02T04:11:11Z http://ndltd.ncl.edu.tw/handle/77453967950782385836 A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition 語音情緒辨識最佳參數之研究 Ching-yi Lin 林靜宜 Master's Tatung University Department of Computer Science and Engineering 97 There are many ways for humans to express their emotions, for instance, speech, attitude or writing. Human speech conveys not only syntax but also the speaker's feeling at the moment of speaking. Thus, emotions play an important role in speech communication, and recognizing human emotion in speech signals has attracted considerable attention. In emotion recognition, the classifiers and features used in the system influence the recognition rate. The purpose of this study is to acquire the most effective feature set for a specific classifier used in speech emotion recognition. There are three main focuses: classifiers, emotion corpus combinations, and the features to be analyzed. In this thesis, 78 speech features are extracted, including Formant, Shimmer, Jitter, Linear Predictive Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC), Mel-Frequency Cepstral Coefficients (MFCC), the first derivative of MFCC (D-MFCC), the second derivative of MFCC (DD-MFCC), Log Frequency Power Coefficients (LFPC), Perceptual Linear Prediction (PLP), RelAtive SpecTrAl PLP (RastaPLP), Log-Energy, and Zero Crossing Rate (ZCR), along with their mean, standard deviation, minimum, maximum and range. The method used to analyze the effectiveness of features is sequential forward selection (SFS). Experimental results indicate that, for five emotions, the most effective feature set using the WD-KNN classifier achieves the highest recognition accuracy of 90% with 13 features. The results also show that the most effective feature among all those extracted is the Linear Predictive Coefficients (LPC), which appear in the most effective feature sets of all classifiers tested. Tsang-long Pao 包蒼龍 2009 degree thesis 65 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others
|
sources |
NDLTD |
description |
Master's === Tatung University === Department of Computer Science and Engineering === 97 === There are many ways for humans to express their emotions, for instance, speech, attitude or writing. Human speech conveys not only syntax but also the speaker's feeling at the moment of speaking. Thus, emotions play an important role in speech communication, and recognizing human emotion in speech signals has attracted considerable attention.
In emotion recognition, the classifiers and features used in the system influence the recognition rate. The purpose of this study is to acquire the most effective feature set for a specific classifier used in speech emotion recognition. There are three main focuses: classifiers, emotion corpus combinations, and the features to be analyzed. In this thesis, 78 speech features are extracted, including Formant, Shimmer, Jitter, Linear Predictive Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC), Mel-Frequency Cepstral Coefficients (MFCC), the first derivative of MFCC (D-MFCC), the second derivative of MFCC (DD-MFCC), Log Frequency Power Coefficients (LFPC), Perceptual Linear Prediction (PLP), RelAtive SpecTrAl PLP (RastaPLP), Log-Energy, and Zero Crossing Rate (ZCR), along with their mean, standard deviation, minimum, maximum and range. The method used to analyze the effectiveness of features is sequential forward selection (SFS).
Experimental results indicate that, for five emotions, the most effective feature set using the WD-KNN classifier achieves the highest recognition accuracy of 90% with 13 features. The results also show that the most effective feature among all those extracted is the Linear Predictive Coefficients (LPC), which appear in the most effective feature sets of all classifiers tested.
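The sequential forward selection procedure named in the abstract can be sketched as follows. This is a minimal illustration, not the thesis's actual implementation: the `evaluate` function is a hypothetical stand-in for the real scoring step (e.g. cross-validated WD-KNN recognition accuracy on the emotion corpus), and the feature names in the toy usage are placeholders.

```python
# Sketch of sequential forward selection (SFS): greedily grow a feature
# set, at each step adding the single feature that most improves the
# evaluation score, stopping when no candidate improves it.

def sfs(features, evaluate, max_size=None):
    """Return (selected_features, best_score) under greedy forward search."""
    selected = []
    remaining = list(features)
    best_score = float("-inf")
    max_size = max_size or len(features)
    while remaining and len(selected) < max_size:
        # Score every single-feature extension of the current set.
        scored = [(evaluate(selected + [f]), f) for f in remaining]
        score, best_f = max(scored)
        if score <= best_score:
            break  # no feature improves the score; stop early
        selected.append(best_f)
        remaining.remove(best_f)
        best_score = score
    return selected, best_score

# Toy usage with a synthetic score: each hypothetical feature has an
# intrinsic usefulness, with diminishing returns as the set grows.
if __name__ == "__main__":
    usefulness = {"mfcc": 0.50, "lpc": 0.30, "pitch": 0.15, "zcr": 0.01}
    def toy_eval(subset):
        return sum(usefulness[f] for f in subset) / (1 + 0.1 * len(subset))
    chosen, score = sfs(usefulness.keys(), toy_eval)
    print(chosen, round(score, 3))  # → ['mfcc', 'lpc', 'pitch'] 0.731
```

In the thesis's setting, `evaluate` would train and score the chosen classifier on each candidate subset, which is why SFS is expensive: selecting k of n features costs O(n·k) classifier evaluations.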
|
author2 |
Tsang-long Pao |
author_facet |
Tsang-long Pao Ching-yi Lin 林靜宜 |
author |
Ching-yi Lin 林靜宜 |
spellingShingle |
Ching-yi Lin 林靜宜 A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition |
author_sort |
Ching-yi Lin |
title |
A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition |
title_short |
A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition |
title_full |
A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition |
title_fullStr |
A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition |
title_full_unstemmed |
A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition |
title_sort |
study on identifying the most effective speech features for speech emotion recognition |
publishDate |
2009 |
url |
http://ndltd.ncl.edu.tw/handle/77453967950782385836 |
work_keys_str_mv |
AT chingyilin astudyonidentifyingthemosteffectivespeechfeaturesforspeechemotionrecognition AT línjìngyí astudyonidentifyingthemosteffectivespeechfeaturesforspeechemotionrecognition AT chingyilin yǔyīnqíngxùbiànshízuìjiācānshùzhīyánjiū AT línjìngyí yǔyīnqíngxùbiànshízuìjiācānshùzhīyánjiū AT chingyilin studyonidentifyingthemosteffectivespeechfeaturesforspeechemotionrecognition AT línjìngyí studyonidentifyingthemosteffectivespeechfeaturesforspeechemotionrecognition |
_version_ |
1718253440968163328 |