Mapping an Auditory Scene Using Eye Tracking Glasses

The cocktail party problem, introduced in 1953, describes the ability to focus auditory attention in a noisy environment, epitomised by a cocktail party. An individual with normal hearing uses several cues to unmask talkers of interest; such cues are often lacking for people with hearing loss. This thesis ex...

Bibliographic Details
Main Authors: Fredriksson, Alfred; Wallin, Joakim
Format: Others
Language: English
Published: Linköpings universitet, Reglerteknik 2020
Subjects: Sensor Fusion; Engineering and Technology
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-170849
id ndltd-UPSALLA1-oai-DiVA.org-liu-170849
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-liu-170849
2020-11-06T05:34:44Z
Mapping an Auditory Scene Using Eye Tracking Glasses
eng
Fredriksson, Alfred
Wallin, Joakim
Linköpings universitet, Reglerteknik
2020
Sensor Fusion
Engineering and Technology
Teknik och teknologier
Student thesis
info:eu-repo/semantics/bachelorThesis
text
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-170849
application/pdf
info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic Sensor Fusion
Engineering and Technology
Teknik och teknologier
description The cocktail party problem, introduced in 1953, describes the ability to focus auditory attention in a noisy environment, epitomised by a cocktail party. An individual with normal hearing uses several cues to unmask talkers of interest; such cues are often lacking for people with hearing loss. This thesis explores the possibility of using a pair of glasses equipped with an inertial measurement unit (IMU), a monocular camera and an eye tracker to estimate an auditory scene and the attention of the person wearing the glasses. Three main areas have been investigated: estimating the head orientation of the user, tracking faces in the scene, and determining the talker of interest using gaze. Implemented on a hearing aid, this solution could be used to artificially unmask talkers in a noisy environment. The head orientation of the user has been estimated with an extended Kalman filter (EKF) algorithm, with a constant velocity model and different sets of measurements: accelerometer, gyroscope, monocular visual odometry (MVO) and gaze estimated bias (GEB). An intrinsic property of IMU sensors is a drift in yaw. A method that uses eye data and gyroscope measurements to estimate the gyroscope bias has been investigated and is called GEB. The MVO methods investigated use either optical flow to track features in succeeding frames or a key frame approach to match features over multiple frames. Using estimated head orientation and face detection software, faces have been tracked, since they can be assumed to be regions of interest in a cocktail party environment. A constant position EKF with a nearest neighbour approach has been used for tracking. Further, eye data retrieved from the glasses has been analyzed to investigate the relation between gaze direction and current talker during conversations.
author Fredriksson, Alfred
Wallin, Joakim
title Mapping an Auditory Scene Using Eye Tracking Glasses
publisher Linköpings universitet, Reglerteknik
publishDate 2020
url http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-170849
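The description mentions face tracking with a constant position filter and a nearest neighbour approach. The following is only a minimal sketch of that general idea, not the thesis implementation: it swaps the EKF for a scalar linear Kalman filter, represents each face by a single bearing angle in degrees, and ignores angle wrap-around. The names `Track`, `predict`, `update` and `step`, and the values of `process_var`, `meas_var` and `gate`, are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Track:
    """One tracked face: a bearing angle (degrees) and its variance."""
    angle: float
    var: float


def predict(track, process_var):
    # Constant position model: the state estimate is left unchanged,
    # only its uncertainty grows between frames.
    track.var += process_var


def update(track, meas, meas_var):
    # Standard scalar Kalman measurement update.
    gain = track.var / (track.var + meas_var)
    track.angle += gain * (meas - track.angle)
    track.var *= (1.0 - gain)


def step(tracks, detections, process_var=1.0, meas_var=4.0, gate=15.0):
    """Advance all tracks one frame with greedy nearest neighbour association."""
    for t in tracks:
        predict(t, process_var)
    unused = list(detections)
    for t in tracks:
        if not unused:
            break
        # Pick the closest remaining detection for this track.
        nearest = min(unused, key=lambda z: abs(z - t.angle))
        # Only accept it if it falls inside the gate (assumed threshold).
        if abs(nearest - t.angle) <= gate:
            update(t, nearest, meas_var)
            unused.remove(nearest)
    # Unassociated detections start new tracks, initialised with the
    # measurement variance as their uncertainty.
    tracks.extend(Track(angle=z, var=meas_var) for z in unused)
    return tracks
```

For example, calling `step` on two tracks at -20 and 30 degrees with detections `[28.0, -22.0, 80.0]` pulls each track towards its closest detection and starts a third track at 80 degrees. The greedy association shown here is the simplest variant; it can mis-assign detections when faces are close together, which a global assignment would handle better.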