Enhancing ASR Systems for Under-Resourced Languages through a Novel Unsupervised Acoustic Model Training Technique
Statistical speech and language processing techniques, requiring large amounts of training data, are currently state-of-the-art in automatic speech recognition. For high-resourced, international languages this data is widely available, while for under-resourced languages the lack of data poses se...
Main Authors: | CUCU, H.; BUZO, A.; BESACIER, L.; BURILEANU, C. |
---|---|
Format: | Article |
Language: | English |
Published: | Stefan cel Mare University of Suceava, 2015-02-01 |
Series: | Advances in Electrical and Computer Engineering |
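Subjects: | speech recognition; under-resourced languages; unsupervised acoustic modeling; unsupervised training |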
Online Access: | http://dx.doi.org/10.4316/AECE.2015.01009 |
id |
doaj-c97353b41379442495f4061416c95f8e |
---|---|
record_format |
Article |
spelling |
doaj-c97353b41379442495f4061416c95f8e; 2020-11-24T23:12:05Z; eng; Stefan cel Mare University of Suceava; Advances in Electrical and Computer Engineering; 1582-7445; 1844-7600; 2015-02-01; 15; 1; 63; 68; 10.4316/AECE.2015.01009; Enhancing ASR Systems for Under-Resourced Languages through a Novel Unsupervised Acoustic Model Training Technique; CUCU, H.; BUZO, A.; BESACIER, L.; BURILEANU, C.; Statistical speech and language processing techniques, requiring large amounts of training data, are currently state-of-the-art in automatic speech recognition. For high-resourced, international languages this data is widely available, while for under-resourced languages the lack of data poses serious problems. Unsupervised acoustic modeling can offer a cost and time effective way of creating a solid acoustic model for any under-resourced language. This study describes a novel unsupervised acoustic model training method and evaluates it on speech data in an under-resourced language: Romanian. The key novel factor of the method is the usage of two complementary seed ASR systems to produce high quality transcriptions, with a Character Error Rate (ChER) < 5%, for initially untranscribed speech data. The methodology leads to a relative Word Error Rate (WER) improvement of more than 10% when 100 hours of untranscribed speech are used.; http://dx.doi.org/10.4316/AECE.2015.01009; speech recognition; under-resourced languages; unsupervised acoustic modeling; unsupervised training |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
CUCU, H.; BUZO, A.; BESACIER, L.; BURILEANU, C. |
author_sort |
CUCU, H. |
title |
Enhancing ASR Systems for Under-Resourced Languages through a Novel Unsupervised Acoustic Model Training Technique |
publisher |
Stefan cel Mare University of Suceava |
series |
Advances in Electrical and Computer Engineering |
issn |
1582-7445 1844-7600 |
publishDate |
2015-02-01 |
description |
Statistical speech and language processing techniques, requiring large amounts of training data, are currently state-of-the-art
in automatic speech recognition. For high-resourced, international languages this data is widely available, while for under-resourced
languages the lack of data poses serious problems. Unsupervised acoustic modeling can offer a cost and time effective way of creating
a solid acoustic model for any under-resourced language. This study describes a novel unsupervised acoustic model training method
and evaluates it on speech data in an under-resourced language: Romanian. The key novel factor of the method is the usage of two
complementary seed ASR systems to produce high quality transcriptions, with a Character Error Rate (ChER) < 5%, for initially untranscribed
speech data. The methodology leads to a relative Word Error Rate (WER) improvement of more than 10% when 100 hours of untranscribed speech
are used. |
topic |
speech recognition; under-resourced languages; unsupervised acoustic modeling; unsupervised training |
url |
http://dx.doi.org/10.4316/AECE.2015.01009 |