Using the probability of readability to order Swedish texts

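In this study we present a new approach to ranking the readability of Swedish texts, based on lexical, morpho-syntactic and syntactic analysis of text as well as machine learning. The basic premise and theory are presented, together with a small experiment testing the feasibility, but not the actual performance, of the approach. The experiment shows that it is possible to implement a system based on the approach; however, the actual performance of such a system has not been evaluated, as the necessary resources for such an evaluation do not yet exist for Swedish. The experiment also shows that a classifier based on the aforementioned linguistic analysis, on our limited test set, outperforms classifiers based on established metrics used to assess readability such as LIX, OVIX and Nominal Ratio.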


Bibliographic Details
Main Authors: Falkenjack, Johan, Heimann Mühlenbock, Katarina
Format: Others
Language: English
Published: Linköpings universitet, Interaktiva och kognitiva system 2012
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-93371
id ndltd-UPSALLA1-oai-DiVA.org-liu-93371
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-liu-93371
2013-06-01T16:20:58Z
Using the probability of readability to order Swedish texts
eng
Falkenjack, Johan
Heimann Mühlenbock, Katarina
Linköpings universitet, Interaktiva och kognitiva system
Linköpings universitet, Tekniska högskolan
Santa Anna IT Research Institute AB, Linköping, Sweden
Språkbanken, University of Gothenburg, Gothenburg
2012
Conference paper
info:eu-repo/semantics/conferenceObject
text
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-93371
Proceedings of the Fourth Swedish Language Technology Conference, p. 27-28
application/pdf
info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
description In this study we present a new approach to ranking the readability of Swedish texts, based on lexical, morpho-syntactic and syntactic analysis of text as well as machine learning. The basic premise and theory are presented, together with a small experiment testing the feasibility, but not the actual performance, of the approach. The experiment shows that it is possible to implement a system based on the approach; however, the actual performance of such a system has not been evaluated, as the necessary resources for such an evaluation do not yet exist for Swedish. The experiment also shows that a classifier based on the aforementioned linguistic analysis, on our limited test set, outperforms classifiers based on established metrics used to assess readability such as LIX, OVIX and Nominal Ratio.
author Falkenjack, Johan
Heimann Mühlenbock, Katarina
title Using the probability of readability to order Swedish texts
title_sort using the probability of readability to order swedish texts
work_keys_str_mv AT falkenjackjohan usingtheprobabilityofreadabilitytoorderswedishtexts
AT heimannmuhlenbockkatarina usingtheprobabilityofreadabilitytoorderswedishtexts
_version_ 1716586588485451776