Articulatory-based Speech Processing Methods for Foreign Accent Conversion

The objective of this dissertation is to develop speech processing methods that enable accent conversion for non-native speakers without altering their identity. We envision accent conversion primarily as a tool for pronunciation training, allowing non-native speakers to hear their native-accented selves. With this application in mind, we p...

Bibliographic Details
Main Author:	Felps, Daniel
Other Authors:	Gutierrez-Osuna, Ricardo
Format:	Others
Language:	en_US
Published:	2012
Subjects:	speech processing voice conversion accent conversion
Online Access:	http://hdl.handle.net/1969.1/ETD-TAMU-2011-08-9760

id	ndltd-tamu.edu-oai-repository.tamu.edu-1969.1-ETD-TAMU-2011-08-9760
record_format	oai_dc
spelling	ndltd-tamu.edu-oai-repository.tamu.edu-1969.1-ETD-TAMU-2011-08-9760 | 2013-01-08T10:44:58Z | Articulatory-based Speech Processing Methods for Foreign Accent Conversion | Felps, Daniel | speech processing | voice conversion | accent conversion | Gutierrez-Osuna, Ricardo | 2012-10-19T15:28:36Z | 2012-10-22T18:05:57Z | 2011-08 | 2012-10-19 | August 2011 | thesis | text | application/pdf | http://hdl.handle.net/1969.1/ETD-TAMU-2011-08-9760 | en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
topic	speech processing voice conversion accent conversion
description	The objective of this dissertation is to develop speech processing methods that enable accent conversion for non-native speakers without altering their identity. We envision accent conversion primarily as a tool for pronunciation training, allowing non-native speakers to hear their native-accented selves. With this application in mind, we present two methods of accent conversion. The first assumes that the voice quality/identity of speech resides in the glottal excitation, while the linguistic content is contained in the vocal tract transfer function. Accent conversion is achieved by convolving the glottal excitation of a non-native speaker with the vocal tract transfer function of a native speaker. The result is perceived as 60 percent less accented, but the speaker is no longer identified as the same individual. The second method of accent conversion selects segments of speech from a corpus of non-native speech based on their acoustic or articulatory similarity to segments from a native speaker. We predict that articulatory features provide a more speaker-independent representation of speech and are therefore better gauges of linguistic similarity across speakers. To test this hypothesis, we collected a custom database containing simultaneous recordings of speech and the positions of important articulators (e.g., lips, jaw, tongue) for a native and a non-native speaker. Resequencing speech from a non-native speaker based on articulatory similarity with a native speaker achieved a 20 percent reduction in accent. The approach is particularly appealing for applications in pronunciation training because it produces realistically achievable changes in accent, since the technique uses sounds already produced by the non-native speaker. A second contribution of this dissertation is the development of subjective and objective measures to assess the performance of accent conversion systems. This is a difficult problem because, in most cases, no ground truth exists. Subjective evaluation is further complicated by the interconnected relationship between accent and identity, but modifications of the stimuli (i.e., reverse speech and voice disguises) allow the two components to be separated. Algorithms that objectively measure accent, quality, and identity are shown to correlate well with their subjective counterparts.
author2	Gutierrez-Osuna, Ricardo
author_facet	Gutierrez-Osuna, Ricardo Felps, Daniel
author	Felps, Daniel
author_sort	Felps, Daniel
title	Articulatory-based Speech Processing Methods for Foreign Accent Conversion
publishDate	2012
url	http://hdl.handle.net/1969.1/ETD-TAMU-2011-08-9760

Similar Items