Joint speaker localization and array calibration using expectation-maximization

Abstract Ad hoc acoustic networks comprising multiple nodes, each of which consists of several microphones, are addressed. From the ad hoc nature of the node constellation, microphone positions are unknown. Hence, typical tasks, such as localization, tracking, and beamforming, cannot be directly app...


Bibliographic Details
Main Authors: Yuval Dorfan, Ofer Schwartz, Sharon Gannot
Format: Article
Language: English
Published: SpringerOpen 2020-06-01
Series: EURASIP Journal on Audio, Speech, and Music Processing
Subjects:
Online Access: http://link.springer.com/article/10.1186/s13636-020-00177-1
id doaj-4e6f96b0013a4305966293d3f93aa7ef
record_format Article
spelling doaj-4e6f96b0013a4305966293d3f93aa7ef | 2020-11-25T02:59:27Z | eng | SpringerOpen | EURASIP Journal on Audio, Speech, and Music Processing | 1687-4722 | 2020-06-01 | 20201119 | 10.1186/s13636-020-00177-1 | Joint speaker localization and array calibration using expectation-maximization | Yuval Dorfan 0 | Ofer Schwartz 1 | Sharon Gannot 2 | Faculty of Engineering, Bar-Ilan University | Audio department, CEVA DSP | Faculty of Engineering, Bar-Ilan University | Abstract Ad hoc acoustic networks comprising multiple nodes, each of which consists of several microphones, are addressed. From the ad hoc nature of the node constellation, microphone positions are unknown. Hence, typical tasks, such as localization, tracking, and beamforming, cannot be directly applied. To tackle this challenging joint multiple speaker localization and array calibration task, we propose a novel variant of the expectation-maximization (EM) algorithm. The coordinates of multiple arrays relative to an anchor array are blindly estimated using naturally uttered speech signals of multiple concurrent speakers. The speakers’ locations, relative to the anchor array, are also estimated. The inter-distances of the microphones in each array, as well as their orientations, are assumed known, which is a reasonable assumption for many modern mobile devices (in outdoor and in several indoor scenarios). The well-known initialization problem of the batch EM algorithm is circumvented by an incremental procedure, also derived here. The proposed algorithm is tested by an extensive simulation study. | http://link.springer.com/article/10.1186/s13636-020-00177-1 | Wireless acoustic sensor network | Joint calibration and localization | Expectation-maximization | Microphone array | Simultaneous speakers | W-disjoint
collection DOAJ
language English
format Article
sources DOAJ
author Yuval Dorfan
Ofer Schwartz
Sharon Gannot
spellingShingle Yuval Dorfan
Ofer Schwartz
Sharon Gannot
Joint speaker localization and array calibration using expectation-maximization
EURASIP Journal on Audio, Speech, and Music Processing
Wireless acoustic sensor network
Joint calibration and localization
Expectation-maximization
Microphone array
Simultaneous speakers
W-disjoint
author_facet Yuval Dorfan
Ofer Schwartz
Sharon Gannot
author_sort Yuval Dorfan
title Joint speaker localization and array calibration using expectation-maximization
title_short Joint speaker localization and array calibration using expectation-maximization
title_full Joint speaker localization and array calibration using expectation-maximization
title_fullStr Joint speaker localization and array calibration using expectation-maximization
title_full_unstemmed Joint speaker localization and array calibration using expectation-maximization
title_sort joint speaker localization and array calibration using expectation-maximization
publisher SpringerOpen
series EURASIP Journal on Audio, Speech, and Music Processing
issn 1687-4722
publishDate 2020-06-01
description Abstract Ad hoc acoustic networks comprising multiple nodes, each of which consists of several microphones, are addressed. From the ad hoc nature of the node constellation, microphone positions are unknown. Hence, typical tasks, such as localization, tracking, and beamforming, cannot be directly applied. To tackle this challenging joint multiple speaker localization and array calibration task, we propose a novel variant of the expectation-maximization (EM) algorithm. The coordinates of multiple arrays relative to an anchor array are blindly estimated using naturally uttered speech signals of multiple concurrent speakers. The speakers’ locations, relative to the anchor array, are also estimated. The inter-distances of the microphones in each array, as well as their orientations, are assumed known, which is a reasonable assumption for many modern mobile devices (in outdoor and in several indoor scenarios). The well-known initialization problem of the batch EM algorithm is circumvented by an incremental procedure, also derived here. The proposed algorithm is tested by an extensive simulation study.
topic Wireless acoustic sensor network
Joint calibration and localization
Expectation-maximization
Microphone array
Simultaneous speakers
W-disjoint
url http://link.springer.com/article/10.1186/s13636-020-00177-1
work_keys_str_mv AT yuvaldorfan jointspeakerlocalizationandarraycalibrationusingexpectationmaximization
AT oferschwartz jointspeakerlocalizationandarraycalibrationusingexpectationmaximization
AT sharongannot jointspeakerlocalizationandarraycalibrationusingexpectationmaximization
_version_ 1724702306875736064