Enhancing ASR Systems for Under-Resourced Languages through a Novel Unsupervised Acoustic Model Training Technique

Statistical speech and language processing techniques, requiring large amounts of training data, are currently state-of-the-art in automatic speech recognition. For high-resourced, international languages this data is widely available, while for under-resourced languages the lack of data poses se...

Bibliographic Details
Main Authors: CUCU, H., BUZO, A., BESACIER, L., BURILEANU, C.
Format: Article
Language: English
Published: Stefan cel Mare University of Suceava 2015-02-01
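Series: Advances in Electrical and Computer Engineering
Subjects: speech recognition; under-resourced languages; unsupervised acoustic modeling; unsupervised training
Online Access: http://dx.doi.org/10.4316/AECE.2015.01009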
id doaj-c97353b41379442495f4061416c95f8e
record_format Article
spelling doaj-c97353b41379442495f4061416c95f8e
2020-11-24T23:12:05Z
eng
Stefan cel Mare University of Suceava
Advances in Electrical and Computer Engineering
1582-7445
1844-7600
2015-02-01
15
1
63
68
10.4316/AECE.2015.01009
Enhancing ASR Systems for Under-Resourced Languages through a Novel Unsupervised Acoustic Model Training Technique
CUCU, H.
BUZO, A.
BESACIER, L.
BURILEANU, C.
Statistical speech and language processing techniques, requiring large amounts of training data, are currently state-of-the-art in automatic speech recognition. For high-resourced, international languages this data is widely available, while for under-resourced languages the lack of data poses serious problems. Unsupervised acoustic modeling can offer a cost and time effective way of creating a solid acoustic model for any under-resourced language. This study describes a novel unsupervised acoustic model training method and evaluates it on speech data in an under-resourced language: Romanian. The key novel factor of the method is the usage of two complementary seed ASR systems to produce high quality transcriptions, with a Character Error Rate (ChER) < 5%, for initially untranscribed speech data. The methodology leads to a relative Word Error Rate (WER) improvement of more than 10% when 100 hours of untranscribed speech are used.
http://dx.doi.org/10.4316/AECE.2015.01009
speech recognition
under-resourced languages
unsupervised acoustic modeling
unsupervised training
collection DOAJ
language English
format Article
sources DOAJ
author CUCU, H.
BUZO, A.
BESACIER, L.
BURILEANU, C.
title Enhancing ASR Systems for Under-Resourced Languages through a Novel Unsupervised Acoustic Model Training Technique
publisher Stefan cel Mare University of Suceava
series Advances in Electrical and Computer Engineering
issn 1582-7445
1844-7600
publishDate 2015-02-01
description Statistical speech and language processing techniques, requiring large amounts of training data, are currently state-of-the-art in automatic speech recognition. For high-resourced, international languages this data is widely available, while for under-resourced languages the lack of data poses serious problems. Unsupervised acoustic modeling can offer a cost and time effective way of creating a solid acoustic model for any under-resourced language. This study describes a novel unsupervised acoustic model training method and evaluates it on speech data in an under-resourced language: Romanian. The key novel factor of the method is the usage of two complementary seed ASR systems to produce high quality transcriptions, with a Character Error Rate (ChER) < 5%, for initially untranscribed speech data. The methodology leads to a relative Word Error Rate (WER) improvement of more than 10% when 100 hours of untranscribed speech are used.
topic speech recognition
under-resourced languages
unsupervised acoustic modeling
unsupervised training
url http://dx.doi.org/10.4316/AECE.2015.01009
_version_ 1725602496111443968