Summary: | This paper develops reduced, combined feature sets for emotional speech recognition. Spectral and cepstral features, namely wavelet coefficients, linear prediction cepstral coefficients (LPCC), and mel-frequency cepstral coefficients (MFCC), serve as the baseline features. Wavelet coefficients are used to capture how frequency content varies over time. MFCC and LPCC features are then derived from the wavelet coefficients so that the extracted features retain the emotion-relevant information. The features are reduced with vector quantization and passed to a radial basis function neural network (RBFNN) classifier. The feature sets are also combined and tested with the same classifier. The work considers five emotions: angry, fear, happy, disgust, and neutral. These are tested on the Berlin (EMO-DB) and Surrey Audio-Visual Expressed Emotion (SAVEE) databases. The proposed frequency-based decomposition and choice of feature combinations give excellent results, as presented in the results section. Keywords: Mel-frequency cepstral coefficient, Linear prediction cepstral coefficient, Wavelet, Feature combination, Classification
|
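The feature-reduction and combination steps named in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that vector quantization is done with plain k-means (Lloyd's algorithm), that each utterance yields a list of per-frame feature vectors, and that combination means concatenating the flattened codebooks. The function names `train_codebook` and `combine`, the codebook size, and the toy frames are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, vec):
    """Index of the codebook centroid closest to vec."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(codebook[i], vec))


def train_codebook(frames, codebook_size, iterations=20, seed=0):
    """Assumed VQ step: k-means reduces many frame vectors to a few centroids."""
    rng = random.Random(seed)
    # Initialise centroids from randomly chosen frames.
    codebook = [list(v) for v in rng.sample(frames, codebook_size)]
    for _ in range(iterations):
        clusters = [[] for _ in codebook]
        for vec in frames:
            clusters[nearest(codebook, vec)].append(vec)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                codebook[i] = [sum(m[d] for m in members) / len(members)
                               for d in range(dim)]
    return codebook


def combine(*codebooks):
    """Assumed combination step: concatenate flattened codebooks."""
    return [value for cb in codebooks for centroid in cb for value in centroid]


# Toy per-frame features standing in for wavelet-derived MFCC or LPCC frames.
toy_frames = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
toy_codebook = train_codebook(toy_frames, codebook_size=2)
toy_combined = combine(toy_codebook, toy_codebook)
```

In this sketch every utterance is reduced to the same number of centroids, so the combined vector has a fixed length regardless of utterance duration, which is what a fixed-input classifier such as an RBF network needs.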