Vowel Sound Synthesis from Electroencephalography during Listening and Recalling

Recent advances in brain imaging technology have furthered our knowledge of the neural basis of auditory and speech processing, often via contributions from invasive brain signal recording and stimulation studies conducted intraoperatively. Herein, an approach for synthesizing vowel sounds directly from scalp-recorded electroencephalography (EEG), a noninvasive neurophysiological recording method, is demonstrated. Given cortical current signals derived from EEG acquired while human participants listen to and recall (i.e., imagine) two vowels, /a/ and /i/, sound parameters are estimated by a convolutional neural network (CNN). The speech synthesized from the estimated parameters is sufficiently natural to achieve recognition rates >85% in a subsequent sound discrimination task. Notably, the CNN identifies the involvement of the brain areas mediating the "what" auditory stream, namely the superior temporal, middle temporal, and Heschl's gyri, demonstrating the efficacy of the computational method in extracting auditory-related information from neuroelectrical activity. Differences in cortical sound representation between listening and recalling are further revealed: the fusiform, calcarine, and anterior cingulate gyri contribute during listening, whereas the inferior occipital gyrus is engaged during recollection. The proposed approach can expand the scope of EEG in decoding auditory perception, which requires high spatial and temporal resolution.
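
The article's code is not reproduced in this record; as a rough sketch of the decoding step the abstract describes, the following 1D CNN (PyTorch) regresses a small set of vowel sound parameters from multichannel cortical current source signals. The source count, window length, layer sizes, and the choice of two formant frequencies as targets are illustrative assumptions, not the authors' configuration.

    # Minimal sketch, NOT the authors' implementation: a 1D CNN that maps
    # cortical current source time series to vowel sound parameters.
    import torch
    import torch.nn as nn

    class VowelParamCNN(nn.Module):
        """Hypothetical CNN mapping cortical currents to sound parameters."""
        def __init__(self, n_sources: int = 64, n_params: int = 2):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv1d(n_sources, 32, kernel_size=7, padding=3),
                nn.ReLU(),
                nn.MaxPool1d(2),
                nn.Conv1d(32, 64, kernel_size=5, padding=2),
                nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),  # collapse the time axis to one feature vector
            )
            self.head = nn.Linear(64, n_params)  # e.g., formant frequencies F1 and F2

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, n_sources, n_timepoints) cortical current signals
            return self.head(self.features(x).squeeze(-1))

    # Example: a batch of 8 windows, 64 current sources, 500 time points each
    model = VowelParamCNN()
    estimated = model(torch.randn(8, 64, 500))  # -> shape (8, 2)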

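The synthesis step is likewise only sketched here: a crude source-filter model that excites two formant resonators with an impulse train. The formant values for /a/ and /i/ are approximate textbook figures and the bandwidths are assumptions; the paper's actual synthesis method may differ.

    # Crude source-filter vowel synthesis from two formants; values below are
    # illustrative assumptions, not taken from the paper.
    import numpy as np
    from scipy.signal import lfilter

    def resonator(freq_hz: float, bw_hz: float, fs: int):
        # Two-pole IIR resonator; b0 = 1 - r is a rough gain normalization.
        r = np.exp(-np.pi * bw_hz / fs)
        theta = 2.0 * np.pi * freq_hz / fs
        return [1.0 - r], [1.0, -2.0 * r * np.cos(theta), r * r]

    def synth_vowel(f1: float, f2: float, fs: int = 16000,
                    dur: float = 0.5, f0: int = 120):
        n = int(fs * dur)
        src = np.zeros(n)
        src[:: fs // f0] = 1.0  # impulse train at the pitch period
        out = src
        for f, bw in ((f1, 80.0), (f2, 120.0)):  # assumed formant bandwidths (Hz)
            b, a = resonator(f, bw, fs)
            out = lfilter(b, a, out)
        return out / (np.abs(out).max() + 1e-9)  # normalize to [-1, 1]

    wave_a = synth_vowel(800.0, 1200.0)  # /a/: F1 ~ 800 Hz, F2 ~ 1200 Hz (approximate)
    wave_i = synth_vowel(300.0, 2300.0)  # /i/: F1 ~ 300 Hz, F2 ~ 2300 Hz (approximate)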

Bibliographic Details
Main Authors: Wataru Akashi, Hiroyuki Kambara, Yousuke Ogata, Yasuharu Koike, Ludovico Minati, Natsue Yoshimura
Format: Article
Language: English
Published: Wiley, 2021-02-01
Series: Advanced Intelligent Systems
Subjects: brain activity signals; cortical current source estimations; deep-learning; electroencephalography; speech syntheses
Online Access: https://doi.org/10.1002/aisy.202000164
id doaj-1044e9dfc3f64ec3852e7614c9367d65
record_format Article
spelling doaj-1044e9dfc3f64ec3852e7614c9367d65; 2021-02-22T15:24:48Z; eng; Wiley; Advanced Intelligent Systems; ISSN 2640-4567; 2021-02-01; vol. 3, iss. 2; pp. n/a; doi:10.1002/aisy.202000164; Vowel Sound Synthesis from Electroencephalography during Listening and Recalling; Wataru Akashi, Hiroyuki Kambara, Yousuke Ogata, Yasuharu Koike, Ludovico Minati, Natsue Yoshimura (all: Institute of Innovative Research, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama 226-8503, Japan); https://doi.org/10.1002/aisy.202000164; keywords: brain activity signals, cortical current source estimations, deep-learning, electroencephalography, speech syntheses
collection DOAJ
language English
format Article
sources DOAJ
author Wataru Akashi
Hiroyuki Kambara
Yousuke Ogata
Yasuharu Koike
Ludovico Minati
Natsue Yoshimura
spellingShingle Wataru Akashi
Hiroyuki Kambara
Yousuke Ogata
Yasuharu Koike
Ludovico Minati
Natsue Yoshimura
Vowel Sound Synthesis from Electroencephalography during Listening and Recalling
Advanced Intelligent Systems
brain activity signals
cortical current source estimations
deep-learning
electroencephalography
speech syntheses
author_facet Wataru Akashi
Hiroyuki Kambara
Yousuke Ogata
Yasuharu Koike
Ludovico Minati
Natsue Yoshimura
author_sort Wataru Akashi
title Vowel Sound Synthesis from Electroencephalography during Listening and Recalling
title_short Vowel Sound Synthesis from Electroencephalography during Listening and Recalling
title_full Vowel Sound Synthesis from Electroencephalography during Listening and Recalling
title_fullStr Vowel Sound Synthesis from Electroencephalography during Listening and Recalling
title_full_unstemmed Vowel Sound Synthesis from Electroencephalography during Listening and Recalling
title_sort vowel sound synthesis from electroencephalography during listening and recalling
publisher Wiley
series Advanced Intelligent Systems
issn 2640-4567
publishDate 2021-02-01
description Recent advances in brain imaging technology have furthered our knowledge of the neural basis of auditory and speech processing, often via contributions from invasive brain signal recording and stimulation studies conducted intraoperatively. Herein, an approach for synthesizing vowel sounds directly from scalp-recorded electroencephalography (EEG), a noninvasive neurophysiological recording method, is demonstrated. Given cortical current signals derived from EEG acquired while human participants listen to and recall (i.e., imagine) two vowels, /a/ and /i/, sound parameters are estimated by a convolutional neural network (CNN). The speech synthesized from the estimated parameters is sufficiently natural to achieve recognition rates >85% in a subsequent sound discrimination task. Notably, the CNN identifies the involvement of the brain areas mediating the "what" auditory stream, namely the superior temporal, middle temporal, and Heschl's gyri, demonstrating the efficacy of the computational method in extracting auditory-related information from neuroelectrical activity. Differences in cortical sound representation between listening and recalling are further revealed: the fusiform, calcarine, and anterior cingulate gyri contribute during listening, whereas the inferior occipital gyrus is engaged during recollection. The proposed approach can expand the scope of EEG in decoding auditory perception, which requires high spatial and temporal resolution.
topic brain activity signals
cortical current source estimations
deep-learning
electroencephalography
speech syntheses
url https://doi.org/10.1002/aisy.202000164
work_keys_str_mv AT wataruakashi vowelsoundsynthesisfromelectroencephalographyduringlisteningandrecalling
AT hiroyukikambara vowelsoundsynthesisfromelectroencephalographyduringlisteningandrecalling
AT yousukeogata vowelsoundsynthesisfromelectroencephalographyduringlisteningandrecalling
AT yasuharukoike vowelsoundsynthesisfromelectroencephalographyduringlisteningandrecalling
AT ludovicominati vowelsoundsynthesisfromelectroencephalographyduringlisteningandrecalling
AT natsueyoshimura vowelsoundsynthesisfromelectroencephalographyduringlisteningandrecalling