Acoustic-Prosodic Entrainment in Human-Human and Human-Computer Dialogue

Bibliographic Details
Main Author: Levitan, Rivka
Language: English
Published: 2014
Online Access: https://doi.org/10.7916/D8GT5KCH
Description
Summary: Entrainment (sometimes called adaptation or alignment) is the tendency of human speakers to adapt to or imitate characteristics of their interlocutors' behavior. This work focuses on entrainment on acoustic-prosodic features. Acoustic-prosodic entrainment has been extensively studied but is not well understood. In particular, it is difficult to compare the results of different studies, since entrainment is usually measured in different ways, reflecting disparate conceptualizations of the phenomenon. In the first part of this thesis, we look for evidence of entrainment on a variety of acoustic-prosodic features according to various conceptualizations, and show that human speakers of both Standard American English and Mandarin Chinese entrain to each other globally and locally, in synchrony, and that this entrainment can be constant or convergent. We explore the relationship between entrainment and gender and show that entrainment on some acoustic-prosodic features is related to social behavior and dialogue coordination. In addition, we show that humans entrain in a novel domain, backchannel-inviting cues, and propose and test a novel hypothesis: that entrainment will be stronger in the case of an outlier feature value. In the second part of the thesis, we describe a method for flexibly and dynamically entraining a TTS voice to multiple acoustic-prosodic features of a user's input utterances, and show in an exploratory study that users prefer an entraining avatar to one that does not entrain, are more likely to ask its advice, and choose more positive adjectives to describe its voice. This work introduces a coherent view of entrainment in both familiar and novel domains. Our results add to the body of knowledge of entrainment in human-human conversations and propose new directions for making use of that knowledge to enhance human-computer interactions.