A Comparison of Human against Machine-Classification of Spatial Audio Scenes in Binaural Recordings of Music

The purpose of this paper is to compare the performance of human listeners against selected machine learning algorithms in the task of classifying spatial audio scenes in binaural recordings of music under practical conditions. Three scenes were subject to classification: (1) music...

Bibliographic Details
Main Authors:	Sławomir K. Zieliński, Hyunkook Lee, Paweł Antoniuk, Oskar Dadan
Format:	Article
Language:	English
Published:	MDPI AG 2020-08-01
Series:	Applied Sciences
Subjects:	spatial audio scene classification; spatial audio information retrieval; convolutional neural networks; deep learning
Online Access:	https://www.mdpi.com/2076-3417/10/17/5956

id	doaj-cda656af25ec493986c6f7c87a64aaf3
record_format	Article
spelling	doaj-cda656af25ec493986c6f7c87a64aaf3; 2020-11-25T03:41:38Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2020-08-01; 10; 5956; 5956; 10.3390/app10175956; A Comparison of Human against Machine-Classification of Spatial Audio Scenes in Binaural Recordings of Music; Sławomir K. Zieliński (0); Hyunkook Lee (1); Paweł Antoniuk (2); Oskar Dadan (3); (0) Faculty of Computer Science, Białystok University of Technology, 15-351 Białystok, Poland; (1) Applied Psychoacoustics Laboratory (APL), University of Huddersfield, Huddersfield HD1 3DH, UK; (2) Faculty of Computer Science, Białystok University of Technology, 15-351 Białystok, Poland; (3) Faculty of Computer Science, Białystok University of Technology, 15-351 Białystok, Poland; https://www.mdpi.com/2076-3417/10/17/5956; spatial audio scene classification; spatial audio information retrieval; convolutional neural networks; deep learning
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Sławomir K. Zieliński; Hyunkook Lee; Paweł Antoniuk; Oskar Dadan
title	A Comparison of Human against Machine-Classification of Spatial Audio Scenes in Binaural Recordings of Music
publisher	MDPI AG
series	Applied Sciences
issn	2076-3417
publishDate	2020-08-01
description	The purpose of this paper is to compare the performance of human listeners against selected machine learning algorithms in the task of classifying spatial audio scenes in binaural recordings of music under practical conditions. Three scenes were subject to classification: (1) music ensemble (a group of musical sources) located in the front, (2) music ensemble located at the back, and (3) music ensemble distributed around a listener. In the listening test, undertaken remotely over the Internet, human listeners reached a classification accuracy of 42.5%. For the listeners who passed the post-screening test, the accuracy was greater, approaching 60%. The same classification task was also undertaken automatically using four machine learning algorithms: convolutional neural network, support vector machines, extreme gradient boosting framework, and logistic regression. The machine learning algorithms substantially outperformed human listeners, with the classification accuracy reaching 84% when tested under binaural-room-impulse-response (BRIR) matched conditions. However, when the algorithms were tested under the BRIR mismatched scenario, their accuracy was comparable to that of the listeners who passed the post-screening test, implying that the capability of machine learning algorithms to perform in unknown electro-acoustic conditions needs to be further improved.
topic	spatial audio scene classification; spatial audio information retrieval; convolutional neural networks; deep learning
url	https://www.mdpi.com/2076-3417/10/17/5956

Similar Items