Experiments on Detection of Voiced Hesitations in Russian Spontaneous Speech

Bibliographic Details
Main Authors:	Vasilisa Verkhodanova, Vladimir Shapranov
Format:	Article
Language:	English
Published:	Hindawi Limited 2016-01-01
Series:	Journal of Electrical and Computer Engineering
Online Access:	http://dx.doi.org/10.1155/2016/2013658

Description
Summary:	The development and popularity of voice-user interfaces have made spontaneous speech processing an important research field. One of the main focus areas in this field is automatic speech recognition (ASR), which enables computers to recognize spoken language and translate it into text. However, ASR systems often work less efficiently for spontaneous speech than for read speech, since the former differs from other types of speech in many ways, and the presence of speech disfluencies is one of its prominent characteristics. These phenomena are an important feature of human-human communication, and at the same time they are a challenging obstacle for speech processing tasks. In this paper we address the detection of voiced hesitations (filled pauses and sound lengthenings) in Russian spontaneous speech using different machine learning techniques, from grid search and gradient descent in rule-based approaches to data-driven ones such as ELM and SVM based on automatically extracted acoustic features. Experimental results on a mixed corpus of spontaneous Russian speech of diverse quality indicate the efficiency of the techniques for the task in question, with SVM outperforming the other methods.
ISSN:	2090-0147 2090-0155

Similar Items