A Gestalt inference model for auditory scene segregation.

Full description

Our current understanding of how the brain segregates auditory scenes into meaningful objects is in line with a Gestaltism framework. These Gestalt principles suggest a theory of how different attributes of the soundscape are extracted and then bound together into separate groups that reflect different objects or streams present in the scene. These cues are thought to reflect the underlying statistical structure of natural sounds, in much the same way that the statistics of natural images are closely linked to the principles that guide figure-ground segregation and object segmentation in vision. In the present study, we leverage inference in stochastic neural networks to learn emergent grouping cues directly from natural soundscapes, including speech, music, and sounds in nature. The model learns a hierarchy of local and global spectro-temporal attributes reminiscent of the simultaneous and sequential Gestalt cues that underlie the organization of auditory scenes. These mappings operate at multiple time scales to analyze an incoming complex scene and are then fused using a Hebbian network that binds coherent features into perceptually segregated auditory objects. The proposed architecture successfully emulates a wide range of well-established auditory scene segregation phenomena and quantifies the complementary roles of segregation and binding cues in driving auditory scene segregation.
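The description above outlines a two-stage architecture: feature channels extracted at multiple time scales, followed by a Hebbian network that binds temporally coherent channels into the same perceptual stream. The sketch below (Python/NumPy) is only a toy illustration of that binding idea, not the authors' model: it assumes crude band-envelope features and a correlation-style Hebbian coincidence rule, and every function name, parameter, and threshold in it (band_envelopes, hebbian_binding, threshold=0.5) is hypothetical.

# Illustrative sketch only (not the published implementation): channels whose
# envelopes fluctuate coherently over time are bound into one group via a
# Hebbian coincidence rule. Assumes simple band envelopes of a synthetic scene.

import numpy as np

def band_envelopes(signal, sr, center_freqs, bandwidth=60.0):
    """Crude feature extraction: smoothed magnitude envelope of narrow bands."""
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / sr)
    envelopes = []
    for fc in center_freqs:
        mask = (freqs > fc - bandwidth) & (freqs < fc + bandwidth)
        band = np.fft.irfft(spectrum * mask, n)
        env = np.abs(band)
        win = max(1, int(0.01 * sr))              # ~10 ms moving average
        env = np.convolve(env, np.ones(win) / win, mode="same")
        envelopes.append(env)
    return np.array(envelopes)                    # shape: (n_channels, n_samples)

def hebbian_binding(features, lr=1e-3, threshold=0.5):
    """Accumulate a Hebbian coincidence matrix W from co-activation of channels,
    then group channels whose normalized coincidence exceeds `threshold`."""
    n_ch, n_t = features.shape
    # Standardize channels so W reflects temporal coherence, not overall level.
    z = features - features.mean(axis=1, keepdims=True)
    z /= (z.std(axis=1, keepdims=True) + 1e-12)
    W = np.zeros((n_ch, n_ch))
    for t in range(n_t):                          # online Hebbian update: dW = lr * x x^T
        x = z[:, t]
        W += lr * np.outer(x, x)
    W /= (lr * n_t)                               # normalized: ~ correlation matrix
    # Simple grouping: channels i and j share a stream if W[i, j] > threshold.
    groups, assigned = [], set()
    for i in range(n_ch):
        if i in assigned:
            continue
        group = {i} | {j for j in range(n_ch) if j != i and W[i, j] > threshold}
        assigned |= group
        groups.append(sorted(group))
    return W, groups

if __name__ == "__main__":
    sr, dur = 16000, 1.0
    t = np.arange(int(sr * dur)) / sr
    # Synthetic scene: two "streams" with different amplitude-modulation rates.
    stream_a = (1 + np.sin(2 * np.pi * 4 * t)) * (np.sin(2 * np.pi * 440 * t)
                                                  + np.sin(2 * np.pi * 880 * t))
    stream_b = (1 + np.sin(2 * np.pi * 9 * t)) * np.sin(2 * np.pi * 1500 * t)
    scene = stream_a + stream_b
    feats = band_envelopes(scene, sr, center_freqs=[440, 880, 1500])
    W, groups = hebbian_binding(feats)
    print("coincidence matrix:\n", np.round(W, 2))
    print("bound groups of channels:", groups)

On this synthetic scene, the 440 Hz and 880 Hz channels share a 4 Hz modulation and should end up bound into one group, while the 9 Hz-modulated 1500 Hz channel should remain separate; the real model of course learns its grouping cues from natural sounds rather than hard-coding a correlation threshold.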

Bibliographic Details
Main Authors: Debmalya Chakrabarty, Mounya Elhilali
Format: Article
Language: English
Published: Public Library of Science (PLoS), 2019-01-01
Series: PLoS Computational Biology
ISSN: 1553-734X, 1553-7358
Online Access: https://doi.org/10.1371/journal.pcbi.1006711