Customers’ Opinion Mining from Extensive Amount of Textual Reviews in Relation to Induced Knowledge Growth

Customers of various services are often invited to type a summarizing review via an Internet portal. Such reviews, written in natural language, are typically unstructured and also carry a numeric rating on a scale from “good” to “bad.” The more reviews, the better the feedback that can be acquired for i...

Bibliographic Details
Main Authors:	Jan Žižka, Arnošt Svoboda
Format:	Article
Language:	English
Published:	Mendel University Press 2015-01-01
Series:	Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis
Subjects:	text mining; customer opinion analysis; decision trees; decision rules; windowing; large data volumes
Online Access:	https://acta.mendelu.cz/63/6/2229/

id	doaj-fe6b2cdedf14476da1508d2a508ea61a
record_format	Article
spelling	doaj-fe6b2cdedf14476da1508d2a508ea61a; 2020-11-25T00:18:28Z; eng; Mendel University Press; Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis; ISSN 1211-8516, 2464-8310; 2015-01-01; vol. 63, no. 6, pp. 2229-2237; DOI 10.11118/actaun201563062229; Jan Žižka (Department of Informatics, Faculty of Business and Economics, Mendel University in Brno, Zemědělská 1, 613 00 Brno, Czech Republic); Arnošt Svoboda (Department of Applied Mathematics and Computer Science, Faculty of Economics and Administration, Masaryk University, Žerotínovo nám. 617/9, 601 77 Brno, Czech Republic); https://acta.mendelu.cz/63/6/2229/
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Jan Žižka, Arnošt Svoboda
author_sort	Jan Žižka
title	Customers’ Opinion Mining from Extensive Amount of Textual Reviews in Relation to Induced Knowledge Growth
publisher	Mendel University Press
series	Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis
issn	1211-8516, 2464-8310
publishDate	2015-01-01
description	Customers of various services are often invited to type a summarizing review via an Internet portal. Such reviews, written in natural language, are typically unstructured and also carry a numeric rating on a scale from “good” to “bad.” The more reviews, the better the feedback that can be acquired for improving the service. However, once massive data have accumulated, the non-linearly growing processing complexity may exceed the computational capacity available for analyzing the text contents. Decision tree inducers like c5 can reveal understandable knowledge from data, but they need the data as a whole. This article describes an application of windowing, a technique for generating dataset subsamples that provide enough information for an inducer to train a classifier and get results similar to those achieved by training a model on the entire dataset. The windowing results, which significantly reduce the complexity of the learning problem, are demonstrated on hundreds of thousands of reviews written in English by hotel-service customers. A user obtains knowledge represented by significant words. The results show classification errors, training and testing times, tree sizes, and the words relevant to the review meaning as a function of the training subsample size. Finally, a method for estimating a suitable training-set size is suggested.
topic	text mining; customer opinion analysis; decision trees; decision rules; windowing; large data volumes
url	https://acta.mendelu.cz/63/6/2229/
_version_	1725376407525130240
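
The windowing idea summarized in the description can be illustrated with a minimal sketch. It assumes the classic scheme in which a tree is trained on a small random window, the examples it misclassifies among the remaining data are added to the window, and the tree is retrained. This record does not state which windowing variant or parameters the authors used, so everything here is an assumption: the small word-presence tree merely stands in for c5, and the function names, the toy reviews, and the settings `initial`, `max_rounds`, and `max_depth` are hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def gini(labels):
    """Gini impurity of a non-empty list of labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, depth=0, max_depth=5):
    """rows: list of (set_of_words, label). Returns a label (leaf) or
    a tuple (word, subtree_if_present, subtree_if_absent)."""
    labels = [y for _, y in rows]
    if depth == max_depth or len(set(labels)) == 1:
        return majority(labels)
    vocab = set().union(*(w for w, _ in rows))
    best = None  # (weighted impurity, word)
    for word in sorted(vocab):
        present = [y for w, y in rows if word in w]
        absent = [y for w, y in rows if word not in w]
        if not present or not absent:
            continue  # this word does not split the rows
        score = (len(present) * gini(present) + len(absent) * gini(absent)) / len(rows)
        if best is None or score < best[0]:
            best = (score, word)
    if best is None:
        return majority(labels)
    word = best[1]
    yes = [r for r in rows if word in r[0]]
    no = [r for r in rows if word not in r[0]]
    return (word, build_tree(yes, depth + 1, max_depth), build_tree(no, depth + 1, max_depth))

def predict(tree, words):
    """Walk the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        word, yes, no = tree
        tree = yes if word in words else no
    return tree

def windowing(rows, initial=4, max_rounds=10, seed=0):
    """Grow the training window with misclassified examples and retrain."""
    pool = rows[:]
    random.Random(seed).shuffle(pool)
    window, rest = pool[:initial], pool[initial:]
    tree = build_tree(window)
    for _ in range(max_rounds):
        missed, correct = [], []
        for r in rest:
            (correct if predict(tree, r[0]) == r[1] else missed).append(r)
        if not missed:
            break  # the window already explains the remaining data
        window, rest = window + missed, correct
        tree = build_tree(window)
    return tree, window

# Hypothetical toy reviews; real input would be the labeled hotel reviews.
reviews = [
    ("clean room friendly staff", "good"),
    ("dirty room rude staff", "bad"),
    ("friendly staff great breakfast", "good"),
    ("rude reception dirty bathroom", "bad"),
    ("clean bathroom great location", "good"),
    ("noisy room dirty carpet", "bad"),
    ("great location clean room", "good"),
    ("rude staff noisy street", "bad"),
]
rows = [(set(text.split()), label) for text, label in reviews]
tree, window = windowing(rows)
full_tree = build_tree(rows)
```

Comparing `len(window)` with `len(rows)`, and the accuracy of `tree` with that of `full_tree`, mirrors on a toy scale the kind of comparison the description mentions between training on a subsample and training on the entire dataset.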

Similar Items