Vowel Sound Synthesis from Electroencephalography during Listening and Recalling
Recent advances in brain imaging technology have furthered our knowledge of the neural basis of auditory and speech processing, often via contributions from invasive brain signal recording and stimulation studies conducted intraoperatively. Herein, an approach is demonstrated for synthesizing vowel sounds straightforwardly from scalp-recorded electroencephalography (EEG), a noninvasive neurophysiological recording method. Given cortical current signals derived from EEG acquired while human participants listen to and recall (i.e., imagine) two vowels, /a/ and /i/, sound parameters are estimated by a convolutional neural network (CNN). The speech synthesized from the estimated parameters is sufficiently natural to achieve recognition rates >85% during a subsequent sound discrimination task. Notably, the CNN identifies the involvement of the brain areas mediating the "what" auditory stream, namely the superior temporal, middle temporal, and Heschl's gyri, demonstrating the efficacy of the computational method in extracting auditory-related information from neuroelectrical activity. Differences in cortical sound representation between listening and recalling are further revealed, such that the fusiform, calcarine, and anterior cingulate gyri contribute during listening, whereas the inferior occipital gyrus is engaged during recollection. The proposed approach can expand the scope of EEG in decoding auditory perception, which requires high spatial and temporal resolution.
Main Authors: | Wataru Akashi, Hiroyuki Kambara, Yousuke Ogata, Yasuharu Koike, Ludovico Minati, Natsue Yoshimura |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2021-02-01 |
Series: | Advanced Intelligent Systems |
Subjects: | brain activity signals; cortical current source estimations; deep-learning; electroencephalography; speech syntheses |
Online Access: | https://doi.org/10.1002/aisy.202000164 |
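The abstract describes the decoding pipeline only at a high level: cortical current signals are derived from scalp EEG, a CNN estimates vowel sound parameters from them, and speech is then synthesized from those estimates. No code accompanies this record, so the following is only a minimal sketch of that kind of regression. It assumes PyTorch, a hypothetical input of 64 cortical current sources over 250 time samples, and first and second formant frequencies (with approximate textbook values for /a/ and /i/) standing in for the unspecified "sound parameters"; none of these choices are taken from the article.

```python
# Illustrative sketch (not the authors' architecture): a 1-D CNN that
# regresses vowel sound parameters from cortical current source time series.
# Input shape, layer sizes, and the use of F1/F2 formants as targets are
# assumptions made for this example only.
import torch
import torch.nn as nn

N_SOURCES = 64   # assumed number of cortical current sources (input channels)
N_SAMPLES = 250  # assumed samples per trial (e.g., 1 s at 250 Hz)
N_PARAMS = 2     # assumed sound parameters per trial, e.g., F1 and F2 in Hz

class VowelParamCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(N_SOURCES, 32, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # collapse the time axis to one value per filter
        )
        self.head = nn.Linear(64, N_PARAMS)

    def forward(self, x):             # x: (batch, N_SOURCES, N_SAMPLES)
        z = self.features(x).squeeze(-1)
        return self.head(z)           # (batch, N_PARAMS)

# Toy training step on random data, just to show the intended usage.
model = VowelParamCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.randn(8, N_SOURCES, N_SAMPLES)                          # fake cortical currents
y = torch.tensor([[730.0, 1090.0]] * 4 + [[270.0, 2290.0]] * 4)   # rough /a/ and /i/ formants
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```

Collapsing the time axis with adaptive average pooling keeps the output head independent of trial length; a real decoder for this task would be tuned to the actual source space, epoch length, and synthesis parameters reported in the article.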
id | doaj-1044e9dfc3f64ec3852e7614c9367d65 |
---|---|
record_format | Article |
spelling | Wataru Akashi, Hiroyuki Kambara, Yousuke Ogata, Yasuharu Koike, Ludovico Minati, and Natsue Yoshimura (all: Institute of Innovative Research, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama 226-8503, Japan), "Vowel Sound Synthesis from Electroencephalography during Listening and Recalling," Advanced Intelligent Systems 3(2), 2021, doi:10.1002/aisy.202000164 |
collection | DOAJ |
issn | 2640-4567 |