Discretisation of conditions in decision rules induced for continuous data.

Discretisation procedures are typically implemented as part of the initial pre-processing of data, before knowledge mining is employed. Conclusions and observations are therefore based on reduced data, since discretisation usually discards some information. The paper presents a different appro...

Bibliographic Details
Main Authors:	Urszula Stańczyk, Beata Zielosko, Grzegorz Baron
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2020-01-01
Series:	PLoS ONE
Online Access:	https://doi.org/10.1371/journal.pone.0231788

id	doaj-58d2dc3c77454d0a99e09222dcd3e677
record_format	Article
spelling	doaj-58d2dc3c77454d0a99e09222dcd3e677 | 2021-03-03T21:41:41Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2020-01-01 | 15 | 4 | e0231788 | 10.1371/journal.pone.0231788 | Discretisation of conditions in decision rules induced for continuous data. | Urszula Stańczyk; Beata Zielosko; Grzegorz Baron | Discretisation procedures are typically implemented as part of the initial pre-processing of data, before knowledge mining is employed. Conclusions and observations are therefore based on reduced data, since discretisation usually discards some information. The paper presents a different approach, which takes advantage of discretisation executed after data mining. In the described study, decision rules were first induced from real-valued features. Secondly, the data sets were discretised. In the third step, the conditions included in the inferred rules were translated into the discrete domain, using the categories found for the attributes. The properties and performance of the rule classifiers were tested in the domain of stylometric analysis of texts, where writing styles were defined through quantitative attributes of a continuous nature. The performed experiments show that the proposed processing leads to sets of rules with significantly reduced sizes while maintaining the quality of predictions, and allows many data discretisation methods to be tested at acceptable computational cost. | https://doi.org/10.1371/journal.pone.0231788
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Urszula Stańczyk, Beata Zielosko, Grzegorz Baron
author_facet	Urszula Stańczyk, Beata Zielosko, Grzegorz Baron
author_sort	Urszula Stańczyk
title	Discretisation of conditions in decision rules induced for continuous data.
publisher	Public Library of Science (PLoS)
series	PLoS ONE
issn	1932-6203
publishDate	2020-01-01
description	Discretisation procedures are typically implemented as part of the initial pre-processing of data, before knowledge mining is employed. Conclusions and observations are therefore based on reduced data, since discretisation usually discards some information. The paper presents a different approach, which takes advantage of discretisation executed after data mining. In the described study, decision rules were first induced from real-valued features. Secondly, the data sets were discretised. In the third step, the conditions included in the inferred rules were translated into the discrete domain, using the categories found for the attributes. The properties and performance of the rule classifiers were tested in the domain of stylometric analysis of texts, where writing styles were defined through quantitative attributes of a continuous nature. The performed experiments show that the proposed processing leads to sets of rules with significantly reduced sizes while maintaining the quality of predictions, and allows many data discretisation methods to be tested at acceptable computational cost.
url	https://doi.org/10.1371/journal.pone.0231788
_version_	1714815630966784000
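
The three steps summarised in the description (induce rules on real-valued features, discretise the data, translate rule conditions into the discrete domain) can be illustrated with a small sketch. This is not the authors' implementation: the rule representation, the names translate_condition and translate_rules, the cut points, and the overlap-based mapping of an interval onto bins are all assumptions made for illustration only.

```python
from bisect import bisect_left, bisect_right
from typing import Dict, FrozenSet, List, Sequence, Tuple

# Hypothetical representation: a condition (attr, low, high) means low <= value < high.
Condition = Tuple[str, float, float]
DiscreteCondition = Tuple[str, FrozenSet[int]]
Rule = Tuple[Tuple[Condition, ...], str]            # (conditions, decision class)
DiscreteRule = Tuple[FrozenSet[DiscreteCondition], str]


def translate_condition(cond: Condition, cuts: Sequence[float]) -> DiscreteCondition:
    """Map a continuous interval onto the indices of the bins it overlaps.

    Bins are defined by sorted cut points: bin 0 is (-inf, cuts[0]),
    bin i is [cuts[i-1], cuts[i]), and the last bin is [cuts[-1], +inf).
    Assumes low < high.
    """
    attr, low, high = cond
    first = bisect_right(cuts, low)   # bin containing the lower bound
    last = bisect_left(cuts, high)    # bin containing values just below the upper bound
    return attr, frozenset(range(first, last + 1))


def translate_rules(rules: List[Rule], cuts: Dict[str, Sequence[float]]) -> List[DiscreteRule]:
    """Translate every rule and drop rules that become identical after translation."""
    seen = set()
    result: List[DiscreteRule] = []
    for conditions, decision in rules:
        discrete = (
            frozenset(translate_condition(c, cuts[c[0]]) for c in conditions),
            decision,
        )
        if discrete not in seen:
            seen.add(discrete)
            result.append(discrete)
    return result


INF = float("inf")

# Made-up cut points, as if produced by some discretisation method.
cuts = {"a": [0.5], "b": [1.0, 2.0]}

# Made-up rules, as if induced from real-valued data.
rules: List[Rule] = [
    ((("a", 0.62, INF),), "X"),
    ((("a", 0.71, INF),), "X"),
    ((("b", -INF, 1.4),), "Y"),
]

discrete_rules = translate_rules(rules, cuts)
print(len(rules), "->", len(discrete_rules))
```

In this toy example the first two rules differ only in a threshold that falls inside the same bin, so they become identical after translation and the rule set shrinks from three rules to two. Because only the conditions are translated, a different set of cut points can be tried without inducing the rules again.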

Similar Items