Website Clickstream Data Visualization Using Improved Markov Chain Modelling In Apache Flume

Clickstream data analysis is the process of collecting, analysing, and reporting aggregate data about the web pages a visitor clicks. Visualizing clickstream data has gained significant importance in many applications, such as web marketing, customer prediction, product management,...

Bibliographic Details
Main Author: Frhan Amjad Jumaah
Format: Article
Language: English
Published: EDP Sciences 2017-01-01
Series: MATEC Web of Conferences
Subjects:
Online Access: https://doi.org/10.1051/matecconf/201712504025
collection DOAJ
issn 2261-236X
description Clickstream data analysis is the process of collecting, analysing, and reporting aggregate data about the web pages a visitor clicks. Visualizing clickstream data has gained significant importance in many applications, such as web marketing, customer prediction, and product management. Most existing works combine various visualization tools with techniques such as Markov chain modelling. However, the accuracy of these methods can be improved by resolving their shortcomings. Markov chain modelling suffers from occlusion and cannot provide a clear display of the visualized data. These issues can be resolved by improving the Markov chain model with a heuristic method based on the Kolmogorov-Smirnov distance and a maximum likelihood estimator for visualization. These concepts are employed between the underlying distribution states to minimize the Markov distribution. The proposed model, named WebClickviz, is implemented in Hadoop Apache Flume, which is a highly advanced tool. Clickstream data visualization accuracy can be improved when Apache Flume tools are used. Performance is evaluated on clickstream data from a specific website, and the results show that the proposed visualization model performs better than existing models such as VizClick.
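The two statistical ingredients named in the description can be illustrated with a minimal sketch. It is not the WebClickviz implementation, whose details the record does not give. It assumes a first-order Markov chain over page names, uses the standard maximum likelihood estimate of transition probabilities (observed transitions divided by total transitions out of a page), and computes a Kolmogorov-Smirnov distance between two transition rows over an arbitrarily ordered state list. The function names, the sample sessions, and the alphabetical state ordering are all hypothetical.

```python
from collections import defaultdict


def estimate_transitions(sessions):
    """Maximum likelihood estimate of first-order transition probabilities.

    P(src -> dst) = count(src -> dst) / count(src -> any page).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for pages in sessions:
        # Count every consecutive pair of pages within one session.
        for src, dst in zip(pages, pages[1:]):
            counts[src][dst] += 1
    probs = {}
    for src, row in counts.items():
        total = sum(row.values())
        probs[src] = {dst: n / total for dst, n in row.items()}
    return probs


def ks_distance(p, q, states):
    """Largest absolute gap between the cumulative distributions of p and q.

    Assumption: `states` fixes an ordering of the pages; for categorical
    pages the result depends on that ordering.
    """
    cum_p = cum_q = 0.0
    dist = 0.0
    for s in states:
        cum_p += p.get(s, 0.0)
        cum_q += q.get(s, 0.0)
        dist = max(dist, abs(cum_p - cum_q))
    return dist


# Hypothetical clickstream sessions (one list of visited pages per visitor).
sessions = [
    ["home", "search", "product", "cart"],
    ["home", "product", "cart"],
    ["home", "search", "search", "product"],
]

P = estimate_transitions(sessions)
states = sorted({page for s in sessions for page in s})
d = ks_distance(P["home"], P["search"], states)
print(P["home"], P["search"], d)
```

A small distance between two rows suggests that the two pages lead visitors to similar next pages, which is one way such states might be compared before drawing them.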
topic Clickstream data
VizClick
WebClickviz
Apache Flume
Markov chain
Kolmogorov-Smirnov distance