Detrending the Waveforms of Steady-State Vowels

Steady-state vowels are vowels uttered with a momentarily fixed vocal tract configuration and with steady vibration of the vocal folds. In this steady state, the vowel waveform appears as a quasi-periodic string of elementary units called pitch periods. Humans perceive this quasi-periodic r...


Bibliographic Details
Main Authors: Marnix Van Soom, Bart de Boer
Format: Article
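Language: English
Published: MDPI AG 2020-03-01
Series: Entropy
Subjects: formant, steady-state, vowel, detrending, acoustic phonetics, source-filter theory, probability theory, uncertainty quantification, model averaging, nested sampling
Online Access: https://www.mdpi.com/1099-4300/22/3/331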
id doaj-ac3ddae9cb1f4d2da6e577b3b5ec69af
record_format Article
spelling doaj-ac3ddae9cb1f4d2da6e577b3b5ec69af 2020-11-25T01:54:14Z eng
MDPI AG, Entropy, ISSN 1099-4300, 2020-03-01, vol. 22, no. 3, art. 331
doi 10.3390/e22030331 (e22030331)
Detrending the Waveforms of Steady-State Vowels
Marnix Van Soom (0), Bart de Boer (1)
(0), (1) Artificial Intelligence Laboratory, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
https://www.mdpi.com/1099-4300/22/3/331
formant; steady-state; vowel; detrending; acoustic phonetics; source-filter theory; probability theory; uncertainty quantification; model averaging; nested sampling
collection DOAJ
language English
format Article
sources DOAJ
author Marnix Van Soom
Bart de Boer
title Detrending the Waveforms of Steady-State Vowels
publisher MDPI AG
series Entropy
issn 1099-4300
publishDate 2020-03-01
description Steady-state vowels are vowels uttered with a momentarily fixed vocal tract configuration and with steady vibration of the vocal folds. In this steady state, the vowel waveform appears as a quasi-periodic string of elementary units called pitch periods. Humans perceive this quasi-periodic regularity as a definite pitch. Likewise, so-called pitch-synchronous methods exploit this regularity by using the duration of the pitch periods as a natural time scale for their analysis. In this work, we present a simple pitch-synchronous method for estimating formants using a Bayesian approach. It slightly generalizes the basic approach of modeling the pitch periods as a superposition of decaying sinusoids, one for each vowel formant, by explicitly taking into account the additional low-frequency content in the waveform, which arises not from the formants but from the glottal pulse. We model this low-frequency content in the time domain as a polynomial trend function that is added to the decaying sinusoids. The problem then reduces to a rather familiar one in macroeconomics: estimate the cycles (our decaying sinusoids) independently from the trend (our polynomial trend function); in other words, detrend the waveforms of steady-state vowels. We show how to do this efficiently.
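The model in the description above (decaying sinusoids plus a polynomial trend within one pitch period) can be illustrated with a minimal sketch. This is not the authors' Bayesian procedure: it assumes the formant frequencies and bandwidths are already known and swaps in ordinary least squares for the amplitudes and trend coefficients. The function names (`basis`, `solve`, `detrend`), the decay convention exp(-pi * bandwidth * t), the normalized time u = t / period for the polynomial, and all numeric values are assumptions made for illustration only.

```python
import math

def basis(t, period, freqs_hz, bandwidths_hz, poly_order):
    """One design-matrix row: trend terms 1, u, ..., u^P with u = t / period,
    then a damped cosine/sine pair per formant (assumed decay: exp(-pi*B*t))."""
    u = t / period
    row = [u ** k for k in range(poly_order + 1)]
    for f, b in zip(freqs_hz, bandwidths_hz):
        decay = math.exp(-math.pi * b * t)
        row.append(decay * math.cos(2 * math.pi * f * t))
        row.append(decay * math.sin(2 * math.pi * f * t))
    return row

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def detrend(times, samples, period, freqs_hz, bandwidths_hz, poly_order):
    """Jointly fit trend and cycles by least squares (normal equations),
    then return (coefficients, trend, cycles) with cycles = samples - trend."""
    rows = [basis(t, period, freqs_hz, bandwidths_hz, poly_order) for t in times]
    n = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    rhs = [sum(r[i] * y for r, y in zip(rows, samples)) for i in range(n)]
    coef = solve(gram, rhs)
    n_trend = poly_order + 1
    trend = [sum(coef[k] * r[k] for k in range(n_trend)) for r in rows]
    cycles = [y - tr for y, tr in zip(samples, trend)]
    return coef, trend, cycles

# Synthetic, noise-free pitch period (all values hypothetical).
fs = 16000.0                      # sampling rate in Hz
period = 0.008                    # pitch period in seconds (125 Hz pitch)
times = [n / fs for n in range(int(fs * period))]
freqs = [500.0, 1500.0]           # assumed formant frequencies in Hz
bands = [80.0, 120.0]             # assumed formant bandwidths in Hz
order = 2                         # quadratic trend
true_coef = [0.3, -1.0, 0.8, 1.0, 0.2, 0.5, -0.3]
samples = [
    sum(c * x for c, x in zip(true_coef, basis(t, period, freqs, bands, order)))
    for t in times
]

coef, trend, cycles = detrend(times, samples, period, freqs, bands, order)
```

Fitting the trend and the sinusoids jointly, rather than subtracting a polynomial first, keeps the low-frequency trend from absorbing part of the formant amplitudes. In a full treatment the frequencies, bandwidths, and polynomial order would themselves be unknowns; here they are fixed so the problem stays linear.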
topic formant
steady-state
vowel
detrending
acoustic phonetics
source-filter theory
probability theory
uncertainty quantification
model averaging
nested sampling
url https://www.mdpi.com/1099-4300/22/3/331