Entropy-Based Approach for the Detection of Changes in Arabic Newspapers’ Content

A new method is proposed for recognizing meaningful changes in social state from transformations of the linguistic content of Arabic newspapers. Detected alterations of the linguistic material serve as an indicator. The proposed approach operates in an “on...

Bibliographic Details
Main Authors: Olga Bernikova, Oleg Granichin, Dan Lemberg, Oleg Redkin, Zeev Volkovich
Format: Article
Language: English
Published: MDPI AG 2020-04-01
Series: Entropy
Subjects: publishing model modeling; anomaly detection; word embedding
Online Access: https://www.mdpi.com/1099-4300/22/4/441
id doaj-654497a77c6944c6a3a6921570d15111
record_format Article
spelling doaj-654497a77c6944c6a3a6921570d15111 2020-11-25T03:10:56Z eng
MDPI AG, Entropy, ISSN 1099-4300, 2020-04-01, vol. 22, no. 4, article 441, doi:10.3390/e22040441
Olga Bernikova: Research Laboratory for Analysis and Modeling of Social Processes, Saint Petersburg State University, Universitetskaya nab. 7-9, Saint Petersburg 190000, Russia
Oleg Granichin: Faculty of Mathematics and Mechanics, and Research Laboratory for Analysis and Modeling of Social Processes, Saint Petersburg State University, Universitetsky prospekt 28, Saint Petersburg 198504, Russia
Dan Lemberg: Software Engineering Department, ORT Braude College of Engineering, Karmiel 21982, Israel
Oleg Redkin: Research Laboratory for Analysis and Modeling of Social Processes, Saint Petersburg State University, Universitetskaya nab. 7-9, Saint Petersburg 190000, Russia
Zeev Volkovich: Software Engineering Department, ORT Braude College of Engineering, Karmiel 21982, Israel
collection DOAJ
language English
format Article
sources DOAJ
publisher MDPI AG
series Entropy
issn 1099-4300
publishDate 2020-04-01
description A new method is proposed for recognizing meaningful changes in social state from transformations of the linguistic content of Arabic newspapers. Detected alterations of the linguistic material serve as an indicator. The approach operates in an “online” fashion and uses pre-trained vector representations of Arabic words. After a pre-processing stage, the words in each issue’s text are replaced by vectors obtained with a word embedding methodology. The approach characterizes a consistent linguistic template by the similarity of the embedded vectors. A change in the distributions of the issue-based samples indicates a difference in the underlying newspaper template. A two-step procedure implements the concept. The first step compares the similarity distribution of the current issue with the union of those of several of its predecessors; repeated under-sampling combined with a two-sample test stabilizes the sampling and returns a collection of the resulting p-values. In the second step, the entropy of these collections is calculated sequentially, and the change points of the resulting time series indicate changes in the newspaper content. Numerical experiments on successive issues of several Arabic newspapers published during the Arab Spring period demonstrate the high reliability of the method.
topic publishing model modeling
anomaly detection
word embedding
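
The two-step procedure summarized in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several choices are assumptions that the record does not specify: the two-sample test is taken to be Kolmogorov-Smirnov with an asymptotic p-value, the entropy is the Shannon entropy of a 10-bin histogram of the p-values, and the names window, size and repeats are hypothetical parameters. The input is assumed to be one list of embedding-similarity values per issue; computing those similarities from word vectors is left out.

```python
import math
import random


def ks_pvalue(x, y):
    """Two-sample Kolmogorov-Smirnov test with an asymptotic p-value.

    The choice of test is an assumption; the record only says "two-sample test".
    """
    x, y = sorted(x), sorted(y)
    n, m = len(x), len(y)
    i = j = 0
    d = 0.0
    # Walk both sorted samples and track the largest gap between the ECDFs.
    while i < n and j < m:
        v = min(x[i], y[j])
        while i < n and x[i] <= v:
            i += 1
        while j < m and y[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:  # the series below converges slowly here; the value is ~1
        return 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return min(1.0, max(0.0, p))


def pvalue_collection(current, pooled, size, repeats, rng):
    """Step 1: repeated under-sampling, one p-value per pair of subsamples."""
    k = min(size, len(current), len(pooled))
    return [ks_pvalue(rng.sample(current, k), rng.sample(pooled, k))
            for _ in range(repeats)]


def entropy(pvalues, bins=10):
    """Step 2: Shannon entropy of the p-value histogram on [0, 1]."""
    counts = [0] * bins
    for p in pvalues:
        counts[min(int(p * bins), bins - 1)] += 1
    total = len(pvalues)
    return -sum(c / total * math.log(c / total) for c in counts if c)


def entropy_series(issues, window=3, size=50, repeats=100, seed=0):
    """Entropy of the p-value collection for each issue after the first `window`."""
    rng = random.Random(seed)
    series = []
    for t in range(window, len(issues)):
        # Union of the similarity samples of the preceding `window` issues.
        pooled = [s for issue in issues[t - window:t] for s in issue]
        pvals = pvalue_collection(issues[t], pooled, size, repeats, rng)
        series.append(entropy(pvals))
    return series


if __name__ == "__main__":
    # Synthetic similarity values, not real newspaper data.
    gen = random.Random(1)
    issues = [[gen.gauss(0.5, 0.1) for _ in range(200)] for _ in range(6)]
    issues.append([gen.gauss(0.7, 0.1) for _ in range(200)])  # shifted issue
    print(entropy_series(issues))
```

Under these assumptions, an issue drawn from the same distribution as its predecessors yields p-values spread across [0, 1] and therefore a high entropy, while a shifted issue concentrates the p-values near zero and the entropy drops. A change-point detector applied to the resulting series would then flag the drop; that detector is not shown here.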