Talker-identification training using simulations of hybrid CI hearing: generalization to speech recognition and music perception
Format: Others
Language: English
Published: University of Iowa, 2014
Online Access: https://ir.uiowa.edu/etd/1317
https://ir.uiowa.edu/cgi/viewcontent.cgi?article=5356&context=etd
Summary: The speech signal carries two types of information: linguistic information (the message content) and indexical information (acoustic cues about the talker). In the traditional view of speech perception, acoustic differences among talkers were considered "noise". In this view, the listeners' task was to strip away unwanted variability to uncover the idealized phonetic representation of the spoken message. A more recent view suggests that both talker information and linguistic information are stored in memory. Rather than being unwanted "noise", talker information aids speech recognition, especially under difficult listening conditions. For example, it has been shown that normal-hearing listeners who completed voice-recognition training were subsequently better at recognizing speech from familiar versus unfamiliar voices.
For individuals with hearing loss, access to both types of information may be compromised. Some studies have shown that cochlear implant (CI) recipients are relatively poor at using indexical speech information because low-frequency speech cues are poorly conveyed by standard CIs. However, some CI users with preserved residual hearing can now combine acoustic amplification of low-frequency information (via a hearing aid) with electrical stimulation of the high frequencies (via the CI). The use of a CI in one ear and a hearing aid in the opposite ear is referred to as bimodal hearing. A second way of combining electrical and acoustic stimulation is through a newer CI system, the hybrid CI. This device combines electrical stimulation with acoustic hearing in the same ear via a shortened electrode array that is intended to preserve residual low-frequency hearing in the apical portion of the cochlea. It may be that hybrid CI users can learn to use voice information to enhance speech understanding.
This study will assess voice learning and its relationship to talker discrimination, music perception, and spoken word recognition in simulations of hybrid CI or bimodal hearing. Specifically, our research questions are as follows: (1) Does training improve talker identification? (2) Does familiarity with the talker or the linguistic message enhance spoken word recognition? (3) Does enhanced spectral processing (as demonstrated by improved talker recognition) generalize to non-linguistic tasks such as talker discrimination and music perception?
To address our research questions, we will recruit normal-hearing adults to participate in eight talker-identification training sessions. Prior to training, subjects will be administered the forward and backward digit span tasks to assess short-term memory and working memory abilities. We hypothesize that the ability to learn voices will correlate with memory ability. Subjects will also complete a talker-discrimination test and a music perception test, both of which require the use of spectral cues. We predict that training will generalize to performance on these tasks. Lastly, a spoken word recognition (SWR) test will be administered before and after talker-identification training. Subjects will listen to sentences produced by eight talkers (four male, four female) and repeat aloud what they hear. Half of the sentences will contain keywords repeated during training, and half will contain keywords not heard during training. Additionally, subjects will have heard sentences from only half of the talkers during training. We hypothesize that subjects will show an advantage for trained keywords over untrained keywords and will perform better with familiar talkers than with unfamiliar talkers.