Prosody resources and symbolic prosodic features for automated phrase break prediction

It is universally recognised that humans process speech and language in chunks, each meaningful in itself. Any two renditions or assimilations of a given sentence will exhibit similarities and discrepancies in chunking, where speakers and readers use pauses and inflections to mark phrase breaks. Thi...

Bibliographic Details
Main Author:	Brierley, Claire
Other Authors:	Atwell, E.
Published:	University of Leeds 2011
Subjects:	005.3
Online Access:	http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.544555

id	ndltd-bl.uk-oai-ethos.bl.uk-544555
record_format	oai_dc
spelling	ndltd-bl.uk-oai-ethos.bl.uk-544555 2017-10-04T03:34:56Z Prosody resources and symbolic prosodic features for automated phrase break prediction Brierley, Claire Atwell, E. 2011 005.3 University of Leeds http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.544555 http://etheses.whiterose.ac.uk/2038/ Electronic Thesis or Dissertation
collection	NDLTD
sources	NDLTD
topic	005.3
description	It is universally recognised that humans process speech and language in chunks, each meaningful in itself. Any two renditions or assimilations of a given sentence will exhibit similarities and discrepancies in chunking, where speakers and readers use pauses and inflections to mark phrase breaks. This thesis reviews deterministic and stochastic approaches to phrase break prediction, plus datasets, evaluation metrics and feature sets. Early rule-based experimental work with a chunk parser gives rise to motivational insights, namely: the limitations of traditional features (syntax and punctuation) and the deficiency of prosody in current phrasing models, and the problem of evaluating performance when the training set only represents one phrasing variant. Such insights inform resource creation in the form of ProPOSEL, a prosody and part-of-speech English lexicon, to create a domain-independent knowledge source, plus a prosodic annotation and text analytics tool for corpus-based research, supported by a comprehensive software tutorial. Future applications of ProPOSEL include prosody-motivated speech-to-viseme generation for "talking heads" and expressive avatar creation. Here, ProPOSEL is used to build the ProPOSEC dataset by merging and annotating two versions of the Spoken English Corpus. Linguistic data arrays in this dataset are first mined for prosodic boundary correlates and later re-conceptualised as training instances for supervised machine learning. This thesis contends that native English speakers use certain sound patterns (e.g. diphthongs and triphthongs) as linguistic signs for phrase breaks, having observed these same patterns at rhythmic junctures in poetry. Pre-boundary lexical items bearing these complex vowels and gold-standard boundary annotations are found to be highly correlated via the chi-squared statistic in different genres, including seventeenth-century English verse, and for multiple speakers. Complex vowels and other symbolic prosodic features are then implemented in a phrasing model to evaluate efficacy for phrase break prediction. The ultimate challenge is to better understand how sound and rhythm, as components of the linguistic sign, inform psycholinguistic chunking even during silent reading.
author2	Atwell, E.
author	Brierley, Claire
title	Prosody resources and symbolic prosodic features for automated phrase break prediction
publisher	University of Leeds
publishDate	2011
url	http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.544555

Similar Items