Multi-level speech timing control

This thesis describes a model of speech timing, predicting at the syllable level, with sensitivity to rhythmic factors at the foot level, that predicts segmental durations by a process of accommodation into the higher-level timing framework. The model is based on analyses of two large databases of B...

Bibliographic Details
Main Author:	Campbell, Wilhelm
Published:	University of Sussex 1992
Subjects:	410 Speech synthesis models
Online Access:	http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.283832

id	ndltd-bl.uk-oai-ethos.bl.uk-283832
record_format	oai_dc
spelling	ndltd-bl.uk-oai-ethos.bl.uk-283832 2015-09-03T03:17:21Z Multi-level speech timing control Campbell, Wilhelm 1992 410 Speech synthesis models University of Sussex http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.283832 Electronic Thesis or Dissertation
collection	NDLTD
sources	NDLTD
topic	410 Speech synthesis models
spellingShingle	410 Speech synthesis models Campbell, Wilhelm Multi-level speech timing control
description	This thesis describes a model of speech timing that predicts durations at the syllable level, with sensitivity to rhythmic factors at the foot level, and derives segmental durations by a process of accommodation into the higher-level timing framework. The model is based on analyses of two large databases of British English speech: one illustrating the range of prosodic variation in the language, the other illustrating segmental duration characteristics in various phonetic environments. Designed for a speech synthesis application, the model also has relevance to linguistic and phonetic theory, and shows that phonological specification of prosodic variation is independent of the phonetic realisation of segmental duration. It also shows, using normalisation of phone-specific timing characteristics, that lengthening of segments within the syllable is of three kinds: prominence-related, applying more to onset segments; boundary-related, applying more to coda segments; and rhythm/rate-related, being more uniform across all component segments. In this model, durations are first predicted at the level of the syllable from consideration of the number of component segments, the nature of the rhyme, and the three types of lengthening. The segmental durations are then constrained to sum to this value by determining an appropriate uniform quantile of their individual distributions. Segmental distributions define the range of likely durations each segment might show under a given set of conditions; their parameters are predicted from broad-class features of place and manner of articulation, factored for position in the syllable, clustering, stress, and finality. Two parameters determine each segmental duration probability density function (pdf), assuming a Gamma distribution, and one parameter determines the quantile within that pdf used to predict the duration of any segment in a given prosodic context.
In experimental tests, each level produced durations that closely fitted the data of four speakers of British English, and showed performance rates higher than a comparable model predicting exclusively at the level of the segment.
author	Campbell, Wilhelm
author_facet	Campbell, Wilhelm
author_sort	Campbell, Wilhelm
title	Multi-level speech timing control
title_short	Multi-level speech timing control
title_full	Multi-level speech timing control
title_fullStr	Multi-level speech timing control
title_full_unstemmed	Multi-level speech timing control
title_sort	multi-level speech timing control
publisher	University of Sussex
publishDate	1992
url	http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.283832
work_keys_str_mv	AT campbellwilhelm multilevelspeechtimingcontrol
_version_	1716817806231601152
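
The accommodation step summarised in the description (segmental durations constrained to sum to a predicted syllable duration by finding one uniform quantile of their Gamma distributions) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names (gamma_cdf, gamma_ppf, accommodate), the shape/scale parameterisation, and the numeric values are all assumptions chosen for illustration, and the prediction of the syllable duration and of the Gamma parameters themselves is not reproduced.

```python
import math

def gamma_cdf(x, shape, scale):
    """Gamma CDF, computed from the power series of the
    regularised lower incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    t = x / scale
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-16 * total and n < 100000:
        term *= t / (shape + n)
        total += term
        n += 1
    log_prefactor = shape * math.log(t) - t - math.lgamma(shape)
    return min(1.0, total * math.exp(log_prefactor))

def gamma_ppf(q, shape, scale):
    """Inverse Gamma CDF by bisection; q must lie strictly in (0, 1)."""
    lo, hi = 0.0, shape * scale          # start the bracket at the mean
    while gamma_cdf(hi, shape, scale) < q:
        hi *= 2.0                        # widen until the quantile is bracketed
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, shape, scale) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def accommodate(segments, syllable_ms, iterations=50):
    """Find one quantile q, shared by all segments, such that the
    segment durations at that quantile sum to syllable_ms.

    segments: list of (shape, scale) pairs, one per segment (hypothetical values).
    Returns (q, list of segment durations in ms).
    """
    lo, hi = 1e-9, 1.0 - 1e-9
    for _ in range(iterations):
        q = 0.5 * (lo + hi)
        total = sum(gamma_ppf(q, k, s) for k, s in segments)
        if total < syllable_ms:
            lo = q                       # syllable too short: lengthen all segments
        else:
            hi = q                       # syllable too long: shorten all segments
    q = 0.5 * (lo + hi)
    return q, [gamma_ppf(q, k, s) for k, s in segments]

# Illustrative values only: three segments (onset, vowel, coda), scale in ms.
segments = [(4.0, 15.0), (6.0, 20.0), (3.0, 25.0)]
q, durations = accommodate(segments, 300.0)
print(round(q, 3), [round(d, 1) for d in durations])
```

Because every segment is placed at the same quantile of its own distribution, segments with wider distributions absorb more of any lengthening or shortening than segments with narrow ones, which is one reading of "accommodation" in the description.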
