INTERNATIONAL SYSTEM OF KNOWLEDGE EXCHANGE FOR YOUNG SCIENTISTS
The paper proposes a system that serves as an electronic repository of student qualification works from different countries and makes it possible to identify and connect young scientists conducting research in related problem areas. The purpose of developing this system is to provide opportun...
Main Authors: | Olesia Barkovska, Vladyslav Kholiev, Georgiy Ivaschenko, Dmytro Rosinskiy |
---|---|
Format: | Article |
Language: | English |
Published: | National Technical University "Kharkiv Polytechnic Institute", 2021-02-01 |
Series: | Сучасні інформаційні системи |
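Subjects: | system nlp text processing acceleration shingles proximity likeness classification preprocessing lemmatization stemming |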
Online Access: | http://ais.khpi.edu.ua/article/view/226836/226384 |
id |
doaj-c4027de2bf2a4c30a1f37336b3636b9c |
---|---|
record_format |
Article |
spelling |
doaj-c4027de2bf2a4c30a1f37336b3636b9c; 2021-05-18T05:55:05Z; eng; National Technical University "Kharkiv Polytechnic Institute"; Сучасні інформаційні системи; ISSN 2522-9052; 2021-02-01; vol. 5, no. 1, pp. 69-74; DOI 10.20998/2522-9052.2021.1.09; INTERNATIONAL SYSTEM OF KNOWLEDGE EXCHANGE FOR YOUNG SCIENTISTS; Olesia Barkovska (https://orcid.org/0000-0001-7496-4353), Vladyslav Kholiev (https://orcid.org/0000-0002-9148-1561), Georgiy Ivaschenko (https://orcid.org/0000-0003-1027-5262), Dmytro Rosinskiy (https://orcid.org/0000-0002-0725-392X); all four authors: Kharkiv National University of RadioElectronics; The paper proposes a system that serves as an electronic repository of student qualification works from different countries and makes it possible to identify and connect young scientists conducting research in related problem areas. The purpose of developing this system is to provide opportunities for knowledge exchange and team research on a common problem, and to identify scientific trends in different countries. The paper studies how preprocessing methods influence the performance of classifiers such as Logistic Regression, LSTM, BERT, and LightGBM, measuring classification speed and F1 score. Conclusions: lemmatization required an operating time almost two times shorter than stemming and gave a score better by an average of 5 percent, so the Logistic Regression classifier with lemmatization at the text preparation stage was chosen for subsequent operation of the proposed ISKE.; http://ais.khpi.edu.ua/article/view/226836/226384; system nlp text processing acceleration shingles proximity likeness classification preprocessing lemmatization stemming |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Olesia Barkovska Vladyslav Kholiev Georgiy Ivaschenko Dmytro Rosinskiy |
author_sort |
Olesia Barkovska |
title |
INTERNATIONAL SYSTEM OF KNOWLEDGE EXCHANGE FOR YOUNG SCIENTISTS |
title_sort |
international system of knowledge exchange for young scientists |
publisher |
National Technical University "Kharkiv Polytechnic Institute" |
series |
Сучасні інформаційні системи |
issn |
2522-9052 |
publishDate |
2021-02-01 |
description |
The paper proposes a system that serves as an electronic repository of student qualification works from different countries and makes it possible to identify and connect young scientists conducting research in related problem areas. The purpose of developing this system is to provide opportunities for knowledge exchange and team research on a common problem, and to identify scientific trends in different countries. The paper studies how preprocessing methods influence the performance of classifiers such as Logistic Regression, LSTM, BERT, and LightGBM, measuring classification speed and F1 score. Conclusions: lemmatization required an operating time almost two times shorter than stemming and gave a score better by an average of 5 percent, so the Logistic Regression classifier with lemmatization at the text preparation stage was chosen for subsequent operation of the proposed ISKE. |
topic |
system nlp text processing acceleration shingles proximity likeness classification preprocessing lemmatization stemming |
url |
http://ais.khpi.edu.ua/article/view/226836/226384 |
_version_ |
1721437674021060608 |
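
The description above compares text preprocessing methods by operating time and F1 score. The following is a minimal, illustrative sketch of how such a comparison could be set up; it is not the authors' implementation. The functions `toy_stem`, `toy_lemmatize`, `preprocess`, and `timed`, and the lookup table `TOY_LEMMAS`, are hypothetical stand-ins for a real stemmer and lemmatizer, and the F1 score is computed for the binary case only.

```python
import time
from typing import Callable, List, Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Binary F1: harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def toy_stem(token: str) -> str:
    """Hypothetical stemmer: strips a few English suffixes (not the Porter algorithm)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


# Hypothetical lemma dictionary; a real lemmatizer would use a full lexicon.
TOY_LEMMAS = {"studies": "study", "studied": "study", "networks": "network"}


def toy_lemmatize(token: str) -> str:
    """Hypothetical lemmatizer: dictionary lookup with the token itself as fallback."""
    return TOY_LEMMAS.get(token, token)


def preprocess(text: str, normalize: Callable[[str], str]) -> List[str]:
    """Lowercase, split on whitespace, and normalize each token."""
    return [normalize(tok) for tok in text.lower().split()]


def timed(fn: Callable, *args) -> Tuple[object, float]:
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    text = "Neural Networks studied"
    for name, normalize in (("stemming", toy_stem), ("lemmatization", toy_lemmatize)):
        tokens, elapsed = timed(preprocess, text, normalize)
        print(name, tokens, f"{elapsed:.6f}s")
    # F1 for an example pair of label lists (made-up labels, for illustration only)
    print("F1:", f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

In a real experiment the normalized tokens would be vectorized and passed to a classifier, and the timing and F1 values would be averaged over a held-out test set rather than a single sentence.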