Semi-Supervised Learning with Sparse Autoencoders in Automatic Speech Recognition

This work explores semi-supervised learning techniques for improving the performance of automatic speech recognition systems. Semi-supervised learning takes advantage of unlabeled data to improve the quality of the representations extracted from the data. The proposed model is a ne...


Bibliographic Details
Main Author:	DHAKA, AKASH KUMAR
Format:	Others
Language:	English
Published:	KTH, Skolan för datavetenskap och kommunikation (CSC) 2016
Subjects:	machine learning; automatic speech recognition; semi supervised learning; Computer Sciences; Datavetenskap (datalogi)
Online Access:	http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-197628

id	ndltd-UPSALLA1-oai-DiVA.org-kth-197628
record_format	oai_dc
spelling	ndltd-UPSALLA1-oai-DiVA.org-kth-197628; 2018-01-14T05:11:47Z; Semi-Supervised Learning with Sparse Autoencoders in Automatic Speech Recognition; eng; Semi-övervakad inlärning med glesa autoencoders i automatisk taligenkänning; DHAKA, AKASH KUMAR; KTH, Skolan för datavetenskap och kommunikation (CSC); 2016; machine learning; automatic speech recognition; semi supervised learning; Computer Sciences; Datavetenskap (datalogi); Student thesis; info:eu-repo/semantics/bachelorThesis; text; http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-197628; EES Examensarbete / Master Thesis; application/pdf; info:eu-repo/semantics/openAccess
collection	NDLTD
language	English
format	Others
sources	NDLTD
topic	machine learning; automatic speech recognition; semi supervised learning; Computer Sciences; Datavetenskap (datalogi)
spellingShingle	machine learning; automatic speech recognition; semi supervised learning; Computer Sciences; Datavetenskap (datalogi); DHAKA, AKASH KUMAR; Semi-Supervised Learning with Sparse Autoencoders in Automatic Speech Recognition
description	This work explores semi-supervised learning techniques for improving the performance of automatic speech recognition systems. Semi-supervised learning takes advantage of unlabeled data to improve the quality of the representations extracted from the data. The proposed model is a neural network whose weights are updated by simultaneously minimizing the weighted sum of a supervised and an unsupervised cost function. These costs are evaluated on the labeled and unlabeled portions of the data set, respectively. The combined cost is optimized through mini-batch stochastic gradient descent via standard backpropagation. The model was tested on a phone classification task on the TIMIT American English data set and on a handwritten digit classification task on the MNIST data set. The results show that the model outperforms a network trained with standard backpropagation on the labeled material alone. The results are also in line with state-of-the-art graph-based semi-supervised training methods.
=== This work aims to explore semi-supervised learning techniques for improving the performance of automatic speech recognition systems. Semi-supervised machine learning uses data that is not labeled with class membership information to improve the quality of the representation extracted from the data. The model described in this work is a neural network whose weights are updated by simultaneously minimizing the weighted sum of a supervised and an unsupervised cost function. These cost functions are evaluated on the labeled and the unlabeled data, respectively. The combined cost functions are optimized by gradient descent using standard backpropagation. The model was evaluated on a phone classification task on the TIMIT American English data set and on a digit classification task on the MNIST data set. The results show that the model performs better than a network trained with backpropagation on labeled data only. The results are also competitive with current state-of-the-art graph-based semi-supervised learning methods.
author	DHAKA, AKASH KUMAR
author_facet	DHAKA, AKASH KUMAR
author_sort	DHAKA, AKASH KUMAR
title	Semi-Supervised Learning with Sparse Autoencoders in Automatic Speech Recognition
title_short	Semi-Supervised Learning with Sparse Autoencoders in Automatic Speech Recognition
title_full	Semi-Supervised Learning with Sparse Autoencoders in Automatic Speech Recognition
title_fullStr	Semi-Supervised Learning with Sparse Autoencoders in Automatic Speech Recognition
title_full_unstemmed	Semi-Supervised Learning with Sparse Autoencoders in Automatic Speech Recognition
title_sort	semi-supervised learning with sparse autoencoders in automatic speech recognition
publisher	KTH, Skolan för datavetenskap och kommunikation (CSC)
publishDate	2016
url	http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-197628
work_keys_str_mv	AT dhakaakashkumar semisupervisedlearningwithsparseautoencodersinautomaticspeechrecognition AT dhakaakashkumar semiovervakadinlarningmedglesaautoencodersiautomatisktaligenkanning
_version_	1718609723614298112
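
The training objective summarized in the description field (a weighted sum of a supervised cost on labeled data and an unsupervised cost on unlabeled data, minimized by mini-batch stochastic gradient descent with backpropagation) can be illustrated with a small sketch. This is not the thesis implementation. The single sigmoid hidden layer, the squared-error reconstruction cost, the L1 sparsity penalty, the alpha / (1 - alpha) weighting, the synthetic data, and all function and parameter names are assumptions made for illustration only.

```python
import numpy as np

def init_params(n_in, n_hidden, n_classes, rng, scale=0.1):
    # Hypothetical architecture: one shared encoder, a decoder and a classifier head.
    return {
        "W_enc": rng.normal(0, scale, (n_in, n_hidden)), "b_enc": np.zeros(n_hidden),
        "W_dec": rng.normal(0, scale, (n_hidden, n_in)), "b_dec": np.zeros(n_in),
        "W_cls": rng.normal(0, scale, (n_hidden, n_classes)), "b_cls": np.zeros(n_classes),
    }

def encode(p, x):
    # Sigmoid hidden representation shared by both costs.
    return 1.0 / (1.0 + np.exp(-(x @ p["W_enc"] + p["b_enc"])))

def supervised_cost(p, x, y):
    # Softmax cross-entropy on a labeled batch, with gradients by backpropagation.
    n = len(x)
    h = encode(p, x)
    logits = h @ p["W_cls"] + p["b_cls"]
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    cost = -np.log(probs[np.arange(n), y]).mean()
    d_logits = probs.copy()
    d_logits[np.arange(n), y] -= 1.0
    d_logits /= n
    d_pre = (d_logits @ p["W_cls"].T) * h * (1.0 - h)
    grads = {"W_cls": h.T @ d_logits, "b_cls": d_logits.sum(axis=0),
             "W_enc": x.T @ d_pre, "b_enc": d_pre.sum(axis=0)}
    return cost, grads

def unsupervised_cost(p, x, sparsity_weight):
    # Assumed sparse-autoencoder cost: squared reconstruction error plus L1 on activations.
    n = len(x)
    h = encode(p, x)
    err = (h @ p["W_dec"] + p["b_dec"]) - x
    cost = 0.5 * (err ** 2).sum(axis=1).mean() + sparsity_weight * np.abs(h).sum(axis=1).mean()
    d_xhat = err / n
    d_h = d_xhat @ p["W_dec"].T + sparsity_weight * np.sign(h) / n
    d_pre = d_h * h * (1.0 - h)
    grads = {"W_dec": h.T @ d_xhat, "b_dec": d_xhat.sum(axis=0),
             "W_enc": x.T @ d_pre, "b_enc": d_pre.sum(axis=0)}
    return cost, grads

def combined_cost(p, x_lab, y_lab, x_unl, alpha, sparsity_weight=0.01):
    # Weighted sum of the two costs; the alpha / (1 - alpha) split is an assumption.
    c_sup, _ = supervised_cost(p, x_lab, y_lab)
    c_uns, _ = unsupervised_cost(p, x_unl, sparsity_weight)
    return alpha * c_sup + (1.0 - alpha) * c_uns

def sgd_step(p, x_lab, y_lab, x_unl, alpha, lr, sparsity_weight=0.01):
    # One simultaneous update of all weights from both costs.
    _, g_sup = supervised_cost(p, x_lab, y_lab)
    _, g_uns = unsupervised_cost(p, x_unl, sparsity_weight)
    for name in p:
        grad = alpha * g_sup.get(name, 0.0) + (1.0 - alpha) * g_uns.get(name, 0.0)
        p[name] -= lr * grad

# Synthetic stand-in data (not TIMIT or MNIST): few labeled, many unlabeled examples.
rng = np.random.default_rng(0)
x_lab = rng.normal(size=(32, 8))
y_lab = (x_lab[:, 0] > 0).astype(int)
x_unl = rng.normal(size=(128, 8))
params = init_params(8, 16, 2, rng)

cost_before = combined_cost(params, x_lab, y_lab, x_unl, alpha=0.5)
for _ in range(500):
    i = rng.choice(len(x_lab), 16, replace=False)  # labeled mini-batch
    j = rng.choice(len(x_unl), 32, replace=False)  # unlabeled mini-batch
    sgd_step(params, x_lab[i], y_lab[i], x_unl[j], alpha=0.5, lr=0.1)
cost_after = combined_cost(params, x_lab, y_lab, x_unl, alpha=0.5)
print(cost_before, cost_after)
```

In this sketch the encoder weights receive gradients from both costs in every step, which is what lets the unlabeled data shape the representation used by the classifier. The value of alpha controls how strongly each cost influences the shared weights.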


Similar Items