Glottal Source Contribution to Higher Order Modes in the Finite Element Synthesis of Vowels

Articulatory speech synthesis has long been based on one-dimensional (1D) approaches. They assume plane wave propagation within the vocal tract and disregard higher order modes that typically appear above 5 kHz. However, such modes may be relevant in obtaining a more natural voice, especially for ph...


Bibliographic Details
Main Authors: Marc Freixes, Marc Arnela, Joan Claudi Socoró, Francesc Alías, Oriol Guasch
Format: Article
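Language: English
Published: MDPI AG 2019-10-01
Series: Applied Sciences
Subjects: voice production, higher order modes, high frequency energy, glottal source, LF model, numerical simulation, finite element method
Online Access: https://www.mdpi.com/2076-3417/9/21/4535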
id doaj-970f599aee454b33a5dbad9ab8f2d15b
record_format Article
spelling doaj-970f599aee454b33a5dbad9ab8f2d15b
2020-11-25T00:56:43Z
eng
MDPI AG
Applied Sciences
2076-3417
2019-10-01
9
21
4535
10.3390/app9214535
app9214535
Glottal Source Contribution to Higher Order Modes in the Finite Element Synthesis of Vowels
Marc Freixes
Marc Arnela
Joan Claudi Socoró
Francesc Alías
Oriol Guasch
GTM—Grup de recerca en Tecnologies Mèdia, La Salle—Universitat Ramon Llull, Quatre Camins, 30, 08022 Barcelona, Spain
Articulatory speech synthesis has long been based on one-dimensional (1D) approaches. They assume plane wave propagation within the vocal tract and disregard higher order modes that typically appear above 5 kHz. However, such modes may be relevant in obtaining a more natural voice, especially for phonation types with significant high frequency energy (HFE) content. This work studies the contribution of the glottal source at high frequencies in the 3D numerical synthesis of vowels. The spoken vocal range is explored using an LF (Liljencrants–Fant) model enhanced with aspiration noise and controlled by the R_d glottal shape parameter. The vowels [ɑ], [i], and [u] are generated with a finite element method (FEM) using realistic 3D vocal tract geometries obtained from magnetic resonance imaging (MRI), as well as simplified straight vocal tracts of a circular cross-sectional area. The symmetry of the latter prevents the onset of higher order modes. Thus, the comparison between realistic and simplified geometries enables us to analyse the influence of such modes. The simulations indicate that higher order modes may be perceptually relevant, particularly for tense phonations (lower R_d values) and/or high fundamental frequency values, F0s. Conversely, vowels with a lax phonation and/or low F0s may result in inaudible HFE levels, especially if aspiration noise is not considered in the glottal source model.
https://www.mdpi.com/2076-3417/9/21/4535
voice production
higher order modes
high frequency energy
glottal source
lf model
numerical simulation
finite element method
collection DOAJ
language English
format Article
sources DOAJ
author Marc Freixes
Marc Arnela
Joan Claudi Socoró
Francesc Alías
Oriol Guasch
author_sort Marc Freixes
title Glottal Source Contribution to Higher Order Modes in the Finite Element Synthesis of Vowels
publisher MDPI AG
series Applied Sciences
issn 2076-3417
publishDate 2019-10-01
description Articulatory speech synthesis has long been based on one-dimensional (1D) approaches. They assume plane wave propagation within the vocal tract and disregard higher order modes that typically appear above 5 kHz. However, such modes may be relevant in obtaining a more natural voice, especially for phonation types with significant high frequency energy (HFE) content. This work studies the contribution of the glottal source at high frequencies in the 3D numerical synthesis of vowels. The spoken vocal range is explored using an LF (Liljencrants–Fant) model enhanced with aspiration noise and controlled by the R_d glottal shape parameter. The vowels [ɑ], [i], and [u] are generated with a finite element method (FEM) using realistic 3D vocal tract geometries obtained from magnetic resonance imaging (MRI), as well as simplified straight vocal tracts of a circular cross-sectional area. The symmetry of the latter prevents the onset of higher order modes. Thus, the comparison between realistic and simplified geometries enables us to analyse the influence of such modes. The simulations indicate that higher order modes may be perceptually relevant, particularly for tense phonations (lower R_d values) and/or high fundamental frequency values, F0s. Conversely, vowels with a lax phonation and/or low F0s may result in inaudible HFE levels, especially if aspiration noise is not considered in the glottal source model.
topic voice production
higher order modes
high frequency energy
glottal source
lf model
numerical simulation
finite element method
url https://www.mdpi.com/2076-3417/9/21/4535
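
The description notes that higher order modes typically appear above 5 kHz. As a rough illustration of where that figure comes from, the sketch below evaluates the textbook cut-on frequency of a rigid-walled circular duct, f = j' c / (2 pi a), where j' is a zero of the derivative of a Bessel function. This is not the authors' FEM computation: the radii and the 350 m/s speed of sound are assumed values chosen for illustration, and the function name is hypothetical.

```python
import math

# Tabulated first zeros of Bessel-function derivatives (standard values).
J11_PRIME_ZERO = 1.8412  # first non-axisymmetric mode of a circular duct
J01_PRIME_ZERO = 3.8317  # first axisymmetric higher order mode


def cut_on_frequency(radius_m, bessel_zero, speed_of_sound=350.0):
    """Cut-on frequency in Hz of a duct mode in a rigid circular tube.

    radius_m and speed_of_sound are illustrative assumptions, not values
    taken from the record above.
    """
    return bessel_zero * speed_of_sound / (2.0 * math.pi * radius_m)


if __name__ == "__main__":
    # Hypothetical cross-section radii, in centimetres.
    for radius_cm in (1.0, 1.5, 2.0):
        radius_m = radius_cm / 100.0
        f_asym = cut_on_frequency(radius_m, J11_PRIME_ZERO)
        f_axi = cut_on_frequency(radius_m, J01_PRIME_ZERO)
        print(f"a = {radius_cm:.1f} cm: "
              f"non-axisymmetric {f_asym:.0f} Hz, axisymmetric {f_axi:.0f} Hz")
```

Under these assumptions, a 2 cm radius gives a first cut-on frequency of about 5.1 kHz, and narrower sections push it higher. The axisymmetric mode cuts on at roughly twice that frequency, which is consistent with the description's remark that a symmetric circular tract delays the onset of higher order modes.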