Estimating a User's Internal State before the First Input Utterance

This paper describes a method for estimating the internal state of a user of a spoken dialog system before his/her first input utterance. When actually using a dialog-based system, the user is often perplexed by the prompt. A typical system provides more detailed information to a user who is taking time to make an input utterance, but such assistance is a nuisance if the user is merely considering how to answer the prompt. To respond appropriately, the spoken dialog system should be able to consider the user's internal state before the user's input. Conventional studies on user modeling have focused on the linguistic information of the utterance for estimating the user's internal state, but this approach cannot estimate the user's state until the end of the user's first utterance. Therefore, we focused on the user's nonverbal output, such as fillers, silence, or head movement, up to the beginning of the input utterance. The experimental data were collected on a Wizard of Oz basis, and the labels were decided by five evaluators. Finally, we conducted a discrimination experiment with the trained user model using combined features. As a three-class discrimination result, we obtained about 85% accuracy in an open test.
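
The abstract describes a three-class discrimination over combined nonverbal cues observed before the first utterance (fillers, silence, head movement). The record does not specify the classifier, feature encoding, or class labels actually used in the paper, so the following is only a minimal, hypothetical sketch of that kind of setup: the feature names, label meanings, and the choice of an SVM are assumptions for illustration.

```python
# Hypothetical sketch only: the paper's actual features, labels, and model are not
# given in this record. This assumes three hand-crafted pre-utterance features and
# a generic SVM for the three-class discrimination mentioned in the abstract.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# One row per dialog session, combining cues observed before the first utterance:
# [number_of_fillers, silence_duration_sec, head_motion_energy]  (made-up features)
X = np.array([
    [0, 0.8, 0.10],
    [0, 1.2, 0.15],
    [2, 3.5, 0.45],
    [3, 4.1, 0.50],
    [1, 6.0, 0.90],
    [1, 5.5, 0.85],
])
# Hypothetical internal-state labels, e.g.
# 0 = ready to answer, 1 = thinking about the answer, 2 = perplexed by the prompt
y = np.array([0, 0, 1, 1, 2, 2])

model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X, y)

# Classify a new user from the cues seen before they start speaking.
print(model.predict([[2, 4.0, 0.55]]))  # e.g. -> [1]
```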


Bibliographic Details
Main Authors: Yuya Chiba, Akinori Ito (Graduate School of Engineering, Tohoku University, 6-6-5 Aramaki aza Aoba, Aoba-ku, Sendai, Miyagi 980-8579, Japan)
Format: Article
Language: English
Published: Hindawi Limited, 2012-01-01
Series: Advances in Human-Computer Interaction
ISSN: 1687-5893, 1687-5907
DOI: 10.1155/2012/865362
Online Access: http://dx.doi.org/10.1155/2012/865362