Starved neural learning : Morpheme segmentation using low amounts of data
Automatic morpheme segmentation as a field has been dominated by unsupervised methods since its inception, partly due to theoretical motivations but also due to resource constraints. Given the success neural network methods have shown in a wide variety of fields in recent years, it would seem compell...
Main Author: | Persson, Peter |
---|---|
Format: | Others |
Language: | English |
Published: | Stockholms universitet, Avdelningen för datorlingvistik, 2018 |
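Subjects: | morpheme segmentation; machine learning; neural networks; convolutional neural networks; LSTM; General Language Studies and Linguistics |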
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-160953 |
id |
ndltd-UPSALLA1-oai-DiVA.org-su-160953 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UPSALLA1-oai-DiVA.org-su-160953; 2018-10-13T06:14:40Z; Starved neural learning : Morpheme segmentation using low amounts of data; eng; Morfemsegmentering med neurala nätverk med små mängder data (Swedish title, in English: "Morpheme segmentation with neural networks using small amounts of data"); Persson, Peter; Stockholms universitet, Avdelningen för datorlingvistik; 2018; morpheme segmentation; machine learning; neural networks; convolutional neural networks; LSTM; General Language Studies and Linguistics; Jämförande språkvetenskap och allmän lingvistik; Student thesis; info:eu-repo/semantics/bachelorThesis; text; http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-160953; application/pdf; info:eu-repo/semantics/openAccess |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
morpheme segmentation; machine learning; neural networks; convolutional neural networks; LSTM; General Language Studies and Linguistics; Jämförande språkvetenskap och allmän lingvistik |
description |
Automatic morpheme segmentation as a field has been dominated by unsupervised methods since its inception, partly due to theoretical motivations but also due to resource constraints. Given the success neural network methods have shown in a wide variety of fields in recent years, it would seem compelling to apply these methods to morpheme segmentation. This study explores the efficacy of modern neural networks, specifically convolutional neural networks and bidirectional LSTM (BLSTM) networks, on the morpheme segmentation task in a low-resource setting, to determine their viability as contenders with previous unsupervised, minimally supervised, and semi-supervised systems in the field. One architecture of each type is implemented and trained on a new gold standard data set, and the results are compared to previously established methods. A qualitative error analysis of the architectures' segmentations is also performed. The study demonstrates that a BLSTM system can be trained with minimal effort to produce a proof-of-concept solution at low levels of training data, and suggests that BLSTM methods may be a fruitful direction for further research in this field. |
author |
Persson, Peter |
title |
Starved neural learning : Morpheme segmentation using low amounts of data |
publisher |
Stockholms universitet, Avdelningen för datorlingvistik |
publishDate |
2018 |
url |
http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-160953 |