Starved neural learning : Morpheme segmentation using low amounts of data

Automatic morpheme segmentation as a field has been dominated by unsupervised methods since its inception. This is partly due to theoretical motivations, but also due to resource constraints. Given the success neural network methods have shown in a wide variety of fields in recent years, it would seem compell...

Bibliographic Details
Main Author: Persson, Peter
Format: Others
Language: English
Published: Stockholms universitet, Avdelningen för datorlingvistik 2018
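Subjects: morpheme segmentation; machine learning; neural networks; convolutional neural networks; LSTM; General Language Studies and Linguistics; Jämförande språkvetenskap och allmän lingvistik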
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-160953
id ndltd-UPSALLA1-oai-DiVA.org-su-160953
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-su-160953
2018-10-13T06:14:40Z
Starved neural learning : Morpheme segmentation using low amounts of data
eng
Morfemsegmentering med neurala nätverk med små mängder data (Swedish title; in English: "Morpheme segmentation with neural networks using small amounts of data")
Persson, Peter
Stockholms universitet, Avdelningen för datorlingvistik
2018
morpheme segmentation
machine learning
neural networks
convolutional neural networks
LSTM
General Language Studies and Linguistics
Jämförande språkvetenskap och allmän lingvistik
Student thesis
info:eu-repo/semantics/bachelorThesis
text
http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-160953
application/pdf
info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic morpheme segmentation
machine learning
neural networks
convolutional neural networks
LSTM
General Language Studies and Linguistics
Jämförande språkvetenskap och allmän lingvistik
description Automatic morpheme segmentation as a field has been dominated by unsupervised methods since its inception. This is partly due to theoretical motivations, but also due to resource constraints. Given the success neural network methods have shown in a wide variety of fields in recent years, it would seem compelling to apply these methods to the morpheme segmentation field. This study explores the efficacy of modern neural networks, specifically convolutional neural networks and bidirectional LSTM (BLSTM) networks, on the morpheme segmentation task in a low-resource setting, to determine whether they are viable competitors to previous unsupervised, minimally supervised, and semi-supervised systems in the field. One architecture of each type is implemented and trained on a new gold standard data set, and the results are compared to previously established methods. A qualitative error analysis of the architectures’ segmentations is also performed. The study demonstrates that a BLSTM system can be trained with minimal effort to produce a proof-of-concept solution at low levels of training data, and suggests that BLSTM methods may be a fruitful direction for further research in this field.
author Persson, Peter
title Starved neural learning : Morpheme segmentation using low amounts of data
publisher Stockholms universitet, Avdelningen för datorlingvistik
publishDate 2018
url http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-160953