ALGORITHMIC AND PROGRAM IMPLEMENTATION OF THE PLAGIARISM DEFINITION IN LEARNING MANAGEMENT SYSTEMS

The main advantage of using information technologies in education, the faster and simpler exchange of information, is also its drawback, because it raises the problem of plagiarism. The purpose of this paper is to develop software for testing text uniqueness in learning manage...


Bibliographic Details
Main Authors:	Y. B. Popova, A. V. Goloburda
Format:	Article
Language:	English
Published:	Belarusian National Technical University 2018-06-01
Series:	Sistemnyj Analiz i Prikladnaâ Informatika
Subjects:	plagiarism; vector document model; terms; n-list; similarity matrix; cluster; cluster analysis
Online Access:	https://sapi.bntu.by/jour/article/view/206

id	doaj-c66debf8d50a439a947ca2af0a0f5ab4
record_format	Article
spelling	doaj-c66debf8d50a439a947ca2af0a0f5ab4; 2021-07-29T08:38:33Z; eng; Belarusian National Technical University; Sistemnyj Analiz i Prikladnaâ Informatika; 2309-4923; 2414-0481; 2018-06-01; 0; 1; 71; 78; 10.21122/2309-4923-2018-1-71-78; 159; ALGORITHMIC AND PROGRAM IMPLEMENTATION OF THE PLAGIARISM DEFINITION IN LEARNING MANAGEMENT SYSTEMS; Y. B. Popova; 0; A. V. Goloburda; 1; Belarusian National Technical University; Belarusian National Technical University; https://sapi.bntu.by/jour/article/view/206
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Y. B. Popova, A. V. Goloburda
title	ALGORITHMIC AND PROGRAM IMPLEMENTATION OF THE PLAGIARISM DEFINITION IN LEARNING MANAGEMENT SYSTEMS
publisher	Belarusian National Technical University
series	Sistemnyj Analiz i Prikladnaâ Informatika
issn	2309-4923 2414-0481
publishDate	2018-06-01
description	The main advantage of using information technologies in education, the faster and simpler exchange of information, is also its drawback, because it raises the problem of plagiarism. The purpose of this paper is to develop software for testing the uniqueness of texts in learning management systems. Achieving this goal requires solving a range of problems related to choosing a method for detecting plagiarism, turning it into an algorithm, and implementing it in software. The paper considers the shingle and super-shingle methods, signature methods, vector models of text representation, and cluster analysis of text information. The authors propose a modification of the vector model that improves the accuracy of identifying similar documents by creating an N-list for each document separately. Documents are then compared pairwise, and the image of one document is formed relative to the N-list of the other. Thus, the i-th row of the similarity matrix records the similarity coefficients of all the documents under consideration relative to the i-th document. The proposed modification also speeds up the calculation, since there is no need to search for terms common to all documents. To check a large number of student works for plagiarism, the authors propose a cluster approach. Its application showed that determining duplicates takes the same time for one document as for all documents in the sample, so all groups of matching student works can be obtained in that same time. Thus, using cluster analysis of text information to detect plagiarism significantly saves both the teacher's time and computing resources. The proposed algorithms are implemented as a web service in Java.
topic	plagiarism; vector document model; terms; n-list; similarity matrix; cluster; cluster analysis
url	https://sapi.bntu.by/jour/article/view/206
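
The per-document N-list and the row-wise similarity matrix described in the abstract can be illustrated with a minimal Java sketch. It is not the authors' implementation: the class name NListSimilarity, the choice of the n most frequent terms as the N-list, the simple word tokenizer, and the cosine measure are all assumptions made for illustration, since the record does not specify them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only: all names and the cosine measure are assumptions. */
final class NListSimilarity {

    /** Counts how often each lowercase word occurs in the text. */
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    /** Assumed N-list: the n most frequent terms of one document, ties broken alphabetically. */
    static List<String> nList(Map<String, Integer> tf, int n) {
        List<String> terms = new ArrayList<>(tf.keySet());
        terms.sort((a, b) -> {
            int byCount = Integer.compare(tf.get(b), tf.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return terms.subList(0, Math.min(n, terms.size()));
    }

    /** Cosine similarity of two documents restricted to the given term list. */
    static double cosineOver(List<String> terms, Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (String term : terms) {
            int x = a.getOrDefault(term, 0);
            int y = b.getOrDefault(term, 0);
            dot += x * y;
            normA += x * x;
            normB += y * y;
        }
        return (normA == 0 || normB == 0) ? 0.0 : dot / Math.sqrt(normA * normB);
    }

    /** Row i holds the similarity of every document relative to the N-list of document i. */
    static double[][] similarityMatrix(List<String> docs, int n) {
        List<Map<String, Integer>> tfs = new ArrayList<>();
        for (String doc : docs) {
            tfs.add(termFrequencies(doc));
        }
        double[][] matrix = new double[docs.size()][docs.size()];
        for (int i = 0; i < docs.size(); i++) {
            // Only the terms of document i are used; no vocabulary common to all documents is built.
            List<String> terms = nList(tfs.get(i), n);
            for (int j = 0; j < docs.size(); j++) {
                matrix[i][j] = cosineOver(terms, tfs.get(i), tfs.get(j));
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "the cat sat on the mat",
                "the cat sat on the mat",
                "quantum physics lecture notes");
        for (double[] row : similarityMatrix(docs, 3)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Because each row is computed against a different N-list, the matrix in this sketch need not be symmetric, which matches the abstract's description of coefficients recorded "relative to the i-th document". Grouping documents whose coefficients exceed a threshold would be one possible way to approach the cluster step, but the record gives no detail on how the authors do it.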


Similar Items