A metatextual markup in the national corpus of Tuvan language: the structure and functionality
Creating natural language corpora helps solve a number of philological and purely linguistic problems for many languages of the peoples of the Russian Federation. The National corpus of Tuvan language (http://www.tuvancorpus.ru/) is one such product, jointly developed by faculty and students at two unive...
Main Author: | Choduraa M. Mongush |
---|---|
Format: | Article |
Language: | Russian |
Published: | Novye Issledovaniâ Tuvy, 2016-12-01 |
Series: | Novye Issledovaniâ Tuvy |
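Subjects: | natural language corpora; National corpus of the Tuvan language; meta-markup; Tuvan language; Tuvan heroic epic |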
Online Access: | https://nit.tuva.asia/nit/article/view/613 |
id |
doaj-5c713ef9ed9d42829656cc2c417dbde3 |
---|---|
record_format |
Article |
spelling |
doaj-5c713ef9ed9d42829656cc2c417dbde3; 2020-11-24T22:12:27Z; rus; Novye Issledovaniâ Tuvy; Novye Issledovaniâ Tuvy; 2079-8482; 2016-12-01; 04606; A metatextual markup in the national corpus of Tuvan language: the structure and functionality; Choduraa M. Mongush (Tuvan State University, Siberian Federal University); Creating natural language corpora helps solve a number of philological and purely linguistic problems for many languages of the peoples of the Russian Federation. The National corpus of Tuvan language (http://www.tuvancorpus.ru/) is one such product, jointly developed by faculty and students at two universities in Krasnoyarsk and Kyzyl. The article presents a meta-markup system which forms the most important part of the search functionality in any corpus. Meta-markup refers to assigning parameters that characterize the text as a whole. Within a corpus, meta-markup makes it possible to search for texts and select them for inclusion in subcorpora by the presence of one or more features. Consequently, the larger the set of such features for each text, the wider the search functionality becomes for various philological and linguistic purposes. The meta-markup system for the texts included in the National corpus of Tuvan language may include up to 18 parameters, such as the author’s name and gender, the title and creation date (year) of the text, its functional sphere, topic, subject area, time and setting of the events described in it, the text’s classification by type of spoken language or literary genre and style, its source, the name of the periodical it appeared in, publisher, publication date, medium, comments, as well as some features of its audience, such as age and education level.; https://nit.tuva.asia/nit/article/view/613; natural language corpora; National corpus of the Tuvan language; meta-markup; Tuvan language; Tuvan heroic epic |
collection |
DOAJ |
language |
Russian |
format |
Article |
sources |
DOAJ |
author |
Choduraa M. Mongush |
spellingShingle |
Choduraa M. Mongush A metatextual markup in the national corpus of Tuvan language: the structure and functionality Novye Issledovaniâ Tuvy natural language corpora National corpus of the Tuvan language meta-markup Tuvan language Tuvan heroic epic |
author_facet |
Choduraa M. Mongush |
author_sort |
Choduraa M. Mongush |
title |
A metatextual markup in the national corpus of Tuvan language: the structure and functionality |
title_short |
A metatextual markup in the national corpus of Tuvan language: the structure and functionality |
title_full |
A metatextual markup in the national corpus of Tuvan language: the structure and functionality |
title_fullStr |
A metatextual markup in the national corpus of Tuvan language: the structure and functionality |
title_full_unstemmed |
A metatextual markup in the national corpus of Tuvan language: the structure and functionality |
title_sort |
metatextual markup in the national corpus of tuvan language: the structure and functionality |
publisher |
Novye Issledovaniâ Tuvy |
series |
Novye Issledovaniâ Tuvy |
issn |
2079-8482 |
publishDate |
2016-12-01 |
description |
Creating natural language corpora helps solve a number of philological and purely linguistic problems for many languages of the peoples of the Russian Federation. The National corpus of Tuvan language (http://www.tuvancorpus.ru/) is one such product, jointly developed by faculty and students at two universities in Krasnoyarsk and Kyzyl.
The article presents a meta-markup system which forms the most important part of the search functionality in any corpus. Meta-markup refers to assigning parameters that characterize the text as a whole. Within a corpus, meta-markup makes it possible to search for texts and select them for inclusion in subcorpora by the presence of one or more features. Consequently, the larger the set of such features for each text, the wider the search functionality becomes for various philological and linguistic purposes.
The meta-markup system for the texts included in the National corpus of Tuvan language may include up to 18 parameters, such as the author’s name and gender, the title and creation date (year) of the text, its functional sphere, topic, subject area, time and setting of the events described in it, the text’s classification by type of spoken language or literary genre and style, its source, the name of the periodical it appeared in, publisher, publication date, medium, comments, as well as some features of its audience, such as age and education level. |
topic |
natural language corpora; National corpus of the Tuvan language; meta-markup; Tuvan language; Tuvan heroic epic |
url |
https://nit.tuva.asia/nit/article/view/613 |
work_keys_str_mv |
AT choduraammongush ametatextualmarkupinthenationalcorpusoftuvanlanguagethestructureandfunctionality AT choduraammongush metatextualmarkupinthenationalcorpusoftuvanlanguagethestructureandfunctionality |
_version_ |
1725803604613267456 |
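
The description above says that meta-markup lets a user select texts into a subcorpus by the presence of one or more features. A minimal Python sketch of that idea follows. It is an illustration only, not the corpus's actual implementation: the `TextMeta` class, its field names, the `select_subcorpus` function, and the sample records are all hypothetical, and only a handful of the up to 18 parameters are modeled.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class TextMeta:
    """Hypothetical meta-markup record; field names are illustrative only."""
    title: str
    author: Optional[str] = None
    author_gender: Optional[str] = None
    year: Optional[int] = None      # creation date (year) of the text
    sphere: Optional[str] = None    # functional sphere
    genre: Optional[str] = None     # literary genre
    medium: Optional[str] = None


def select_subcorpus(texts, **criteria):
    """Return the texts whose meta-markup matches every given feature."""
    known = {f.name for f in fields(TextMeta)}
    unknown = set(criteria) - known
    if unknown:
        raise ValueError(f"unknown meta parameters: {sorted(unknown)}")
    return [
        t for t in texts
        if all(getattr(t, name) == value for name, value in criteria.items())
    ]


# Invented placeholder records, not data from the corpus.
SAMPLE = [
    TextMeta(title="Text A", author_gender="f", year=1990, genre="epic"),
    TextMeta(title="Text B", author_gender="m", year=1990, genre="novel"),
    TextMeta(title="Text C", year=2005, genre="epic"),
]

if __name__ == "__main__":
    for text in select_subcorpus(SAMPLE, genre="epic", year=1990):
        print(text.title)
```

Each keyword argument narrows the selection further, which mirrors the abstract's point that a richer set of features per text widens what can be searched for.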