Audio features dedicated to the detection and tracking of arousal and valence in musical compositions

The aim of this paper was to discover what combination of audio features gives the best performance with music emotion detection. Emotion recognition was treated as a regression problem, and a two-dimensional valence–arousal model was used to measure emotions in music. Features extracted by Essentia...

Bibliographic Details
Main Author: Jacek Grekow
Format: Article
Language: English
Published: Taylor & Francis Group 2018-07-01
Series: Journal of Information and Telecommunication
Subjects: Music emotion detection; audio features; feature selection; emotion tracking
Online Access: http://dx.doi.org/10.1080/24751839.2018.1463749
id doaj-f283c7a829a84d90a2f7d4f968d31126
record_format Article
spelling doaj-f283c7a829a84d90a2f7d4f968d31126
2020-11-25T00:40:27Z
eng
Taylor & Francis Group
Journal of Information and Telecommunication
2475-1839
2475-1847
2018-07-01
2
3
322
333
10.1080/24751839.2018.1463749
1463749
Audio features dedicated to the detection and tracking of arousal and valence in musical compositions
Jacek Grekow
0
Bialystok University of Technology
The aim of this paper was to discover what combination of audio features gives the best performance with music emotion detection. Emotion recognition was treated as a regression problem, and a two-dimensional valence–arousal model was used to measure emotions in music. Features extracted by Essentia and Marsyas, tools for audio analysis and audio-based music information retrieval, were used. The influence of different feature sets was examined – low level, rhythm, tonal, and their combination – on arousal and valence prediction. The use of a combination of different types of features significantly improves the results compared with using just one group of features. Features particularly dedicated to the detection of arousal and valence separately, as well as features useful in both cases, were found and presented. This paper presents also the process of building emotion maps of musical compositions. The obtained emotion maps provide new knowledge about the distribution of emotions in an examined audio recording. They reveal new knowledge that had only been available to music experts until this point.
http://dx.doi.org/10.1080/24751839.2018.1463749
Music emotion detection
audio features
feature selection
emotion tracking
collection DOAJ
language English
format Article
sources DOAJ
author Jacek Grekow
spellingShingle Jacek Grekow
Audio features dedicated to the detection and tracking of arousal and valence in musical compositions
Journal of Information and Telecommunication
Music emotion detection
audio features
feature selection
emotion tracking
author_facet Jacek Grekow
author_sort Jacek Grekow
title Audio features dedicated to the detection and tracking of arousal and valence in musical compositions
title_short Audio features dedicated to the detection and tracking of arousal and valence in musical compositions
title_full Audio features dedicated to the detection and tracking of arousal and valence in musical compositions
title_fullStr Audio features dedicated to the detection and tracking of arousal and valence in musical compositions
title_full_unstemmed Audio features dedicated to the detection and tracking of arousal and valence in musical compositions
title_sort audio features dedicated to the detection and tracking of arousal and valence in musical compositions
publisher Taylor & Francis Group
series Journal of Information and Telecommunication
issn 2475-1839
2475-1847
publishDate 2018-07-01
description The aim of this paper was to discover what combination of audio features gives the best performance with music emotion detection. Emotion recognition was treated as a regression problem, and a two-dimensional valence–arousal model was used to measure emotions in music. Features extracted by Essentia and Marsyas, tools for audio analysis and audio-based music information retrieval, were used. The influence of different feature sets was examined – low level, rhythm, tonal, and their combination – on arousal and valence prediction. The use of a combination of different types of features significantly improves the results compared with using just one group of features. Features particularly dedicated to the detection of arousal and valence separately, as well as features useful in both cases, were found and presented. This paper also presents the process of building emotion maps of musical compositions. The obtained emotion maps provide new knowledge about the distribution of emotions in an examined audio recording. They reveal new knowledge that had only been available to music experts until this point.
topic Music emotion detection
audio features
feature selection
emotion tracking
url http://dx.doi.org/10.1080/24751839.2018.1463749
work_keys_str_mv AT jacekgrekow audiofeaturesdedicatedtothedetectionandtrackingofarousalandvalenceinmusicalcompositions
_version_ 1725290137478234112