FEATURES OF NON-LOCAL SEMANTIC LINKS IN RUSSIAN TEXTS
Subject of Research. One approach to automatic text analysis is the construction of subordination trees, in which the words of a sentence are connected to each other by semantic-syntactic links. The study covers Russian-language texts of general political, literary, and highly...
Main Authors: | Boyarsky K.K., Kanevsky E.A. |
---|---|
Format: | Article |
Language: | English |
Published: | Saint Petersburg National Research University of Information Technologies, Mechanics and Optics (ITMO University), 2018-07-01 |
Series: | Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki |
Subjects: | semantic-syntactical analysis, syntactical links, subordination tree, n-grams, coreference |
Online Access: | https://ntv.ifmo.ru/file/article/18152.pdf |
id |
doaj-14b836b552fd40279518465a4fabe807 |
---|---|
record_format |
Article |
spelling |
doaj-14b836b552fd40279518465a4fabe807; 2020-11-25T01:29:38Z; eng; Saint Petersburg National Research University of Information Technologies, Mechanics and Optics (ITMO University); Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki; ISSN 2226-1494; 2018-07-01; vol. 18, no. 5, pp. 863-869; DOI 10.17586/2226-1494-2018-18-5-863-869; FEATURES OF NON-LOCAL SEMANTIC LINKS IN RUSSIAN TEXTS; Boyarsky K.K., Kanevsky E.A.; https://ntv.ifmo.ru/file/article/18152.pdf; semantic-syntactical analysis, syntactical links, subordination tree, n-grams, coreference |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Boyarsky K.K., Kanevsky E.A. |
title |
FEATURES OF NON-LOCAL SEMANTIC LINKS IN RUSSIAN TEXTS |
publisher |
Saint Petersburg National Research University of Information Technologies, Mechanics and Optics (ITMO University) |
series |
Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki |
issn |
2226-1494 |
publishDate |
2018-07-01 |
description |
Subject of Research. One approach to automatic text analysis is the construction of subordination trees, in which the words of a sentence are connected to each other by semantic-syntactic links. The study covers Russian-language texts of general political, literary, and highly specialized character. Special attention is paid to cases in which the linked words are separated by a considerable distance. Method. The subordination trees were built with a semantic-syntactic parser. The distribution of links of different types by length was then calculated, and the frequencies of occurrence of non-local links were studied. Main Results. It is shown that the fraction of non-local links, depending on the link type, can reach tens of percent. This is especially important for links originating from predicate nodes (subject, adverbial, etc.), as well as for anaphoric links. It is noted that publicly available semantic classifiers and thesauri have limited applicability to the problem of correctly linking distant words in a sentence. Practical Relevance. It is shown that, when extracting ontological or scenario-based information or resolving coreference, the long syntactic links that form the non-local semantic context cannot be neglected. It is concluded that the analysis of n-grams alone is insufficient for adequately extracting ontological or scenario-based information from text. In this regard, there is a need to compile micro-dictionaries focused on particular syntactic structures. |
topic |
semantic-syntactical analysis, syntactical links, subordination tree, n-grams, coreference |
url |
https://ntv.ifmo.ru/file/article/18152.pdf |
_version_ |
1725095945516875776 |
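
The Method part of the description (computing the distribution of links of different types by length and the frequency of non-local links) can be illustrated with a minimal Python sketch. It is not the authors' parser or code. The `Link` representation, the link-type labels, the toy sentence, and the `min_nonlocal` threshold are all assumptions made for illustration; the record does not state how "non-local" is delimited.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Link(NamedTuple):
    """Hypothetical link record: 1-based word positions within one sentence."""
    dependent: int
    head: int
    kind: str


def link_length(link):
    # Length of a link = distance in words between dependent and head.
    return abs(link.dependent - link.head)


def length_distribution(links):
    """Return {link type: Counter({length: count})}."""
    dist = defaultdict(Counter)
    for link in links:
        dist[link.kind][link_length(link)] += 1
    return dict(dist)


def nonlocal_fraction(links, min_nonlocal=3):
    """Per link type, the fraction of links with length >= min_nonlocal.

    The threshold is an assumed parameter, not a value from the article.
    """
    total = Counter()
    far = Counter()
    for link in links:
        total[link.kind] += 1
        if link_length(link) >= min_nonlocal:
            far[link.kind] += 1
    return {kind: far[kind] / total[kind] for kind in total}


# Toy subordination tree for one invented 8-word sentence (predicate at position 5).
links = [
    Link(1, 5, "subject"),
    Link(2, 1, "attribute"),
    Link(4, 5, "adverbial"),
    Link(6, 5, "object"),
    Link(8, 5, "adverbial"),
]

dist = length_distribution(links)
fractions = nonlocal_fraction(links)
print(dist)
print(fractions)
```

In a real study the links would come from a parser run over a corpus, and the per-type fractions would be aggregated over all sentences rather than a single toy example.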