Metrics for Polyphonic Sound Event Detection
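This paper presents and discusses various metrics proposed for evaluation of polyphonic sound event detection systems used in realistic situations where there are typically multiple sound sources active simultaneously. The system output in this case contains overlapping events, marked as multiple sounds detected as being active at the same time. The polyphonic system output requires a suitable procedure for evaluation against a reference. Metrics from neighboring fields such as speech recognition and speaker diarization can be used, but they need to be partially redefined to deal with the overlapping events. We present a review of the most common metrics in the field and the way they are adapted and interpreted in the polyphonic case. We discuss segment-based and event-based definitions of each metric and explain the consequences of instance-based and class-based averaging using a case study. In parallel, we provide a toolbox containing implementations of presented metrics.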
Main Authors: | Annamaria Mesaros, Toni Heittola, Tuomas Virtanen |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2016-05-01 |
Series: | Applied Sciences |
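Subjects: | pattern recognition; audio signal processing; audio content analysis; computational auditory scene analysis; sound events; everyday sounds; polyphonic sound event detection; evaluation of sound event detection |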
Online Access: | http://www.mdpi.com/2076-3417/6/6/162 |
id |
doaj-75a0b3c4043c4f23a4dda15c5db850d0 |
---|---|
record_format |
Article |
spelling |
doaj-75a0b3c4043c4f23a4dda15c5db850d0; 2020-11-24T21:41:41Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2016-05-01; volume 6, issue 6, article 162; 10.3390/app6060162; app6060162; Metrics for Polyphonic Sound Event Detection; Annamaria Mesaros, Toni Heittola, Tuomas Virtanen (all: Department of Signal Processing, Tampere University of Technology, P.O. Box 553, Tampere FI-33101, Finland); http://www.mdpi.com/2076-3417/6/6/162 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Annamaria Mesaros, Toni Heittola, Tuomas Virtanen |
title |
Metrics for Polyphonic Sound Event Detection |
publisher |
MDPI AG |
series |
Applied Sciences |
issn |
2076-3417 |
publishDate |
2016-05-01 |
description |
This paper presents and discusses various metrics proposed for evaluation of polyphonic sound event detection systems used in realistic situations where there are typically multiple sound sources active simultaneously. The system output in this case contains overlapping events, marked as multiple sounds detected as being active at the same time. The polyphonic system output requires a suitable procedure for evaluation against a reference. Metrics from neighboring fields such as speech recognition and speaker diarization can be used, but they need to be partially redefined to deal with the overlapping events. We present a review of the most common metrics in the field and the way they are adapted and interpreted in the polyphonic case. We discuss segment-based and event-based definitions of each metric and explain the consequences of instance-based and class-based averaging using a case study. In parallel, we provide a toolbox containing implementations of presented metrics. |
topic |
pattern recognition; audio signal processing; audio content analysis; computational auditory scene analysis; sound events; everyday sounds; polyphonic sound event detection; evaluation of sound event detection |
url |
http://www.mdpi.com/2076-3417/6/6/162 |