LMC-SMCA: A New Active Learning Method in ASR

In Automatic Speech Recognition (ASR), transcribed data take substantial effort to obtain. It is worthwhile to explore how to select the more informative samples from the un-transcribed data pool to get a better model at limited cost. Therefore, active learning in ASR has become a research top...

Bibliographic Details
Main Authors:	Xiusong Sun, Bo Wang, Shaohan Liu, Tingxiang Lu, Xin Shan, Qun Yang
Format:	Article
Language:	English
Published:	IEEE 2021-01-01
Series:	IEEE Access
Subjects:	Speech recognition; active learning; committee-based; certainty-based methods
Online Access:	https://ieeexplore.ieee.org/document/9363163/

id	doaj-9ceba31f16ea4dd0845dd396b9a543f1
record_format	Article
spelling	doaj-9ceba31f16ea4dd0845dd396b9a543f1; 2021-03-30T15:31:14Z; eng; IEEE; IEEE Access; 2169-3536; 2021-01-01; 9; 37011; 37021; 10.1109/ACCESS.2021.3062157; 9363163; LMC-SMCA: A New Active Learning Method in ASR; Xiusong Sun [0] https://orcid.org/0000-0003-0232-7069; Bo Wang [1]; Shaohan Liu [2] https://orcid.org/0000-0001-6967-0262; Tingxiang Lu [3] https://orcid.org/0000-0003-4906-7604; Xin Shan [4]; Qun Yang [5] https://orcid.org/0000-0001-6824-8473; College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China; NARI Group Corporation (State Grid Electric Power Research Institute), Nanjing, China; College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China; NARI Group Corporation (State Grid Electric Power Research Institute), Nanjing, China; NARI Group Corporation (State Grid Electric Power Research Institute), Nanjing, China; College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China; https://ieeexplore.ieee.org/document/9363163/; Speech recognition; active learning; committee-based; certainty-based methods
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Xiusong Sun, Bo Wang, Shaohan Liu, Tingxiang Lu, Xin Shan, Qun Yang
author_sort	Xiusong Sun
title	LMC-SMCA: A New Active Learning Method in ASR
publisher	IEEE
series	IEEE Access
issn	2169-3536
publishDate	2021-01-01
description	In Automatic Speech Recognition (ASR), transcribed data take substantial effort to obtain. It is worthwhile to explore how to select the more informative samples from the un-transcribed data pool to get a better model at limited cost. Active learning in ASR has therefore become a research topic. In this manuscript, we propose two new active learning methods: the Signal-Model Committee Approach (SMCA) and the LM-based Certainty Approach (LMCA). The two methods evaluate the amount of information in a sample from different angles and can be applied together for joint sampling in some scenarios. We conducted comparative experiments on the Listen, Attend and Spell (LAS) model under different requirements. In these experiments, we compared our approach with random sampling and with another state-of-the-art committee-based approach based on heterogeneous neural networks (HNN). We evaluated our approach by character error rate (CER) on a Mandarin Chinese speech recognition task. The results show that the proposed approach is not only simple to use but also achieves the best performance among the compared methods.
topic	Speech recognition; active learning; committee-based; certainty-based methods
url	https://ieeexplore.ieee.org/document/9363163/

Similar Items