Mix Multiple Features to Evaluate the Content and the Linguistic Quality of Text Summaries
In this article, we propose a machine-learning method for evaluating the content and linguistic quality of text summaries. The method combines multiple features to build predictive models that evaluate the content and linguistic quality of new summaries (un...
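The system-level evaluation that the abstract describes (average the scores of the summaries produced by one system, then correlate the resulting system scores with manual system scores) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy scores, and the choice of Pearson correlation are all assumptions, since the record does not say which correlation measure was used.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def system_scores(summary_scores):
    """Average per-summary scores by system: {system_id: mean score}."""
    by_system = defaultdict(list)
    for system_id, score in summary_scores:
        by_system[system_id].append(score)
    return {system_id: mean(scores) for system_id, scores in by_system.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical (system_id, predicted summary score) pairs; not data from the article.
predicted = [("A", 2.0), ("A", 4.0), ("B", 1.0), ("B", 2.0), ("C", 4.0), ("C", 5.0)]
# Hypothetical manual system scores, also invented for illustration.
manual = {"A": 3.5, "B": 1.0, "C": 4.0}

auto = system_scores(predicted)
systems = sorted(auto)  # fixed ordering so both score lists line up
r = pearson([auto[s] for s in systems], [manual[s] for s in systems])
print(auto, round(r, 3))
```

A rank-based measure such as Spearman's correlation could be swapped in for `pearson` if only the ordering of systems matters.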
Main Authors: | Samira Ellouze, Maher Jaoua, Lamia Hadrich Belguith |
---|---|
Format: | Article |
Language: | English |
Published: | University of Zagreb Faculty of Electrical Engineering and Computing, 2017-01-01 |
Series: | Journal of Computing and Information Technology |
Online Access: | http://hrcak.srce.hr/file/270305 |
id |
doaj-b592f2cf7731468ab64e672d0e0425f4 |
---|---|
record_format |
Article |
spelling |
doaj-b592f2cf7731468ab64e672d0e0425f4; 2020-11-24T23:11:58Z; eng; University of Zagreb Faculty of Electrical Engineering and Computing; Journal of Computing and Information Technology; ISSN 1330-1136, 1846-3908; 2017-01-01; vol. 25, no. 2, pp. 149-166; DOI 10.20532/cit.2017.1003398; 183330; Mix Multiple Features to Evaluate the Content and the Linguistic Quality of Text Summaries; Samira Ellouze, Maher Jaoua, Lamia Hadrich Belguith (all: University of Sfax, Faculty of Economics and Management of Sfax, Sfax, Tunisia); http://hrcak.srce.hr/file/270305 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Samira Ellouze, Maher Jaoua, Lamia Hadrich Belguith |
title |
Mix Multiple Features to Evaluate the Content and the Linguistic Quality of Text Summaries |
publisher |
University of Zagreb Faculty of Electrical Engineering and Computing |
series |
Journal of Computing and Information Technology |
issn |
1330-1136 1846-3908 |
publishDate |
2017-01-01 |
description |
In this article, we propose a machine-learning method for evaluating the content and linguistic quality of text summaries. The method combines multiple features to build predictive models that evaluate the content and linguistic quality of new (unseen) summaries constructed from the same source documents as the summaries used to train and validate the models. To obtain the best model, we test many single and ensemble learning classifiers. The constructed models perform well in predicting the content and linguistic quality scores. To evaluate summarization systems, we calculate each system's score as the average of the scores of the summaries produced by that system, and then measure the correlation between this system score and the manual system score. The obtained correlation indicates that the system score outperforms the baseline scores. |
url |
http://hrcak.srce.hr/file/270305 |