Articulatory-based Speech Processing Methods for Foreign Accent Conversion

The objective of this dissertation is to develop speech processing methods that enable without altering their identity. We envision accent conversion primarily as a tool for pronunciation training, allowing non-native speakers to hear their native-accented selves. With this application in mind, we p...


Bibliographic Details
Main Author: Felps, Daniel
Other Authors: Gutierrez-Osuna, Ricardo
Format: Others
Language: en_US
Published: 2012
Subjects: speech processing, voice conversion, accent conversion
Online Access: http://hdl.handle.net/1969.1/ETD-TAMU-2011-08-9760
id ndltd-tamu.edu-oai-repository.tamu.edu-1969.1-ETD-TAMU-2011-08-9760
record_format oai_dc
spelling ndltd-tamu.edu-oai-repository.tamu.edu-1969.1-ETD-TAMU-2011-08-9760 2013-01-08T10:44:58Z Articulatory-based Speech Processing Methods for Foreign Accent Conversion Felps, Daniel speech processing voice conversion accent conversion Gutierrez-Osuna, Ricardo 2012-10-19T15:28:36Z 2012-10-22T18:05:57Z 2012-10-19T15:28:36Z 2012-10-22T18:05:57Z 2011-08 2012-10-19 August 2011 thesis text application/pdf http://hdl.handle.net/1969.1/ETD-TAMU-2011-08-9760 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
topic speech processing
voice conversion
accent conversion
spellingShingle speech processing
voice conversion
accent conversion
Felps, Daniel
Articulatory-based Speech Processing Methods for Foreign Accent Conversion
description The objective of this dissertation is to develop speech processing methods that enable without altering their identity. We envision accent conversion primarily as a tool for pronunciation training, allowing non-native speakers to hear their native-accented selves. With this application in mind, we present two methods of accent conversion. The first assumes that the voice quality/identity of speech resides in the glottal excitation, while the linguistic content is contained in the vocal tract transfer function. Accent conversion is achieved by convolving the glottal excitation of a non-native speaker with the vocal tract transfer function of a native speaker. The result is perceived as 60 percent less accented, but it is no longer identified as the same individual. The second method of accent conversion selects segments of speech from a corpus of non-native speech based on their acoustic or articulatory similarity to segments from a native speaker. We predict that articulatory features provide a more speaker-independent representation of speech and are therefore better gauges of linguistic similarity across speakers. To test this hypothesis, we collected a custom database containing simultaneous recordings of speech and the positions of important articulators (e.g., lips, jaw, tongue) for a native and a non-native speaker. Resequencing speech from a non-native speaker based on articulatory similarity with a native speaker achieved a 20 percent reduction in accent. The approach is particularly appealing for applications in pronunciation training because it modifies speech in a way that produces realistically achievable changes in accent (i.e., the technique uses sounds already produced by the non-native speaker). A second contribution of this dissertation is the development of subjective and objective measures to assess the performance of accent conversion systems. This is a difficult problem because, in most cases, no ground truth exists. Subjective evaluation is further complicated by the interconnected relationship between accent and identity, but modifications of the stimuli (i.e., reversed speech and voice disguises) allow the two components to be separated. Algorithms that objectively measure accent, quality, and identity are shown to correlate well with their subjective counterparts.
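The second method in the description above (selecting non-native segments by their articulatory similarity to native segments) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the record does not state the distance measure, the feature set, or how segments are joined, so the Euclidean distance, the two-dimensional feature vectors, and the names `nearest_segment` and `resequence` are all assumptions made for illustration. Speaker normalization and concatenation smoothing are omitted.

```python
import math


def nearest_segment(target, candidates):
    """Return the index of the candidate feature vector closest to target.

    Euclidean distance is an assumption; the source does not name a measure.
    """
    return min(range(len(candidates)),
               key=lambda i: math.dist(target, candidates[i]))


def resequence(native_features, nonnative_features, nonnative_segments):
    """For each native segment, pick the most similar non-native segment.

    native_features:    one feature vector per native segment
    nonnative_features: one feature vector per non-native segment
    nonnative_segments: the non-native audio segments (any object), aligned
                        one-to-one with nonnative_features
    """
    if len(nonnative_features) != len(nonnative_segments):
        raise ValueError("features and segments must be aligned one-to-one")
    return [nonnative_segments[nearest_segment(f, nonnative_features)]
            for f in native_features]


# Hypothetical 2-D articulatory features, e.g. (tongue height, lip aperture).
nonnative_features = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
nonnative_segments = ["seg_a", "seg_b", "seg_c"]
native_features = [(0.9, 0.1), (0.1, 0.8), (0.1, 0.1)]

result = resequence(native_features, nonnative_features, nonnative_segments)
print(result)  # ['seg_b', 'seg_c', 'seg_a']
```

Because every output segment is drawn from the non-native speaker's own recordings, the sketch reflects the property the description highlights: the converted speech contains only sounds that speaker has already produced.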
author2 Gutierrez-Osuna, Ricardo
author_facet Gutierrez-Osuna, Ricardo
Felps, Daniel
author Felps, Daniel
author_sort Felps, Daniel
title Articulatory-based Speech Processing Methods for Foreign Accent Conversion
title_short Articulatory-based Speech Processing Methods for Foreign Accent Conversion
title_full Articulatory-based Speech Processing Methods for Foreign Accent Conversion
title_fullStr Articulatory-based Speech Processing Methods for Foreign Accent Conversion
title_full_unstemmed Articulatory-based Speech Processing Methods for Foreign Accent Conversion
title_sort articulatory-based speech processing methods for foreign accent conversion
publishDate 2012
url http://hdl.handle.net/1969.1/ETD-TAMU-2011-08-9760
work_keys_str_mv AT felpsdaniel articulatorybasedspeechprocessingmethodsforforeignaccentconversion
_version_ 1716505458120851456