Metrics for Polyphonic Sound Event Detection

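This paper presents and discusses various metrics proposed for evaluation of polyphonic sound event detection systems used in realistic situations where there are typically multiple sound sources active simultaneously. The system output in this case contains overlapping events, marked as multiple sounds detected as being active at the same time. The polyphonic system output requires a suitable procedure for evaluation against a reference. Metrics from neighboring fields such as speech recognition and speaker diarization can be used, but they need to be partially redefined to deal with the overlapping events. We present a review of the most common metrics in the field and the way they are adapted and interpreted in the polyphonic case. We discuss segment-based and event-based definitions of each metric and explain the consequences of instance-based and class-based averaging using a case study. In parallel, we provide a toolbox containing implementations of presented metrics.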

Bibliographic Details
Main Authors: Annamaria Mesaros, Toni Heittola, Tuomas Virtanen
Format: Article
Language: English
Published: MDPI AG 2016-05-01
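Series: Applied Sciences
Subjects: pattern recognition; audio signal processing; audio content analysis; computational auditory scene analysis; sound events; everyday sounds; polyphonic sound event detection; evaluation of sound event detection
Online Access: http://www.mdpi.com/2076-3417/6/6/162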
id doaj-75a0b3c4043c4f23a4dda15c5db850d0
record_format Article
spelling doaj-75a0b3c4043c4f23a4dda15c5db850d0; 2020-11-24T21:41:41Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2016-05-01; 6; 6; 162; 10.3390/app6060162; app6060162; Metrics for Polyphonic Sound Event Detection; Annamaria Mesaros, Toni Heittola, Tuomas Virtanen (all: Department of Signal Processing, Tampere University of Technology, P.O. Box 553, Tampere FI-33101, Finland); http://www.mdpi.com/2076-3417/6/6/162
collection DOAJ
language English
format Article
sources DOAJ
author Annamaria Mesaros
Toni Heittola
Tuomas Virtanen
title Metrics for Polyphonic Sound Event Detection
publisher MDPI AG
series Applied Sciences
issn 2076-3417
publishDate 2016-05-01
description This paper presents and discusses various metrics proposed for evaluation of polyphonic sound event detection systems used in realistic situations where there are typically multiple sound sources active simultaneously. The system output in this case contains overlapping events, marked as multiple sounds detected as being active at the same time. The polyphonic system output requires a suitable procedure for evaluation against a reference. Metrics from neighboring fields such as speech recognition and speaker diarization can be used, but they need to be partially redefined to deal with the overlapping events. We present a review of the most common metrics in the field and the way they are adapted and interpreted in the polyphonic case. We discuss segment-based and event-based definitions of each metric and explain the consequences of instance-based and class-based averaging using a case study. In parallel, we provide a toolbox containing implementations of presented metrics.
topic pattern recognition
audio signal processing
audio content analysis
computational auditory scene analysis
sound events
everyday sounds
polyphonic sound event detection
evaluation of sound event detection
url http://www.mdpi.com/2076-3417/6/6/162