Web News Mining Using New Features: A Comparative Study

Bibliographic Details
Main Author: Halgurd S. Maghdid
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/8594593/
Description
Summary: Web-based applications are a well-known platform for exchanging information between Internet users. However, processing huge volumes of information (Big Data), such as web news or web advertisements of product information shared through users, is the main challenge. On the other hand, such web applications are the most accessible media for users to get up-to-date information. At the same time, these applications demand heavy computation in terms of both space and time, and they drain the battery of users’ mobile devices. One way to mitigate these challenges is therefore to mine, or extract, specific information based on specific features. These features can be the users’ behavior or information retrieved from different sources. This article designs and implements a web application that extracts news information using new features such as geolocation and time information, and it presents a comparative study of three different mining techniques. The application runs on different devices, including laptops, smartphones, and tablets, and retrieves the feature information accordingly. The obtained information can then serve as input to the data-mining techniques: K-Nearest-Neighbor (k-NN), decision tree, and a deep-learning recurrent neural network (such as Long Short-Term Memory, LSTM). These techniques are implemented separately and compared in terms of time/space complexity and classification accuracy. The results show that k-NN has the lowest mining accuracy (~85%) and takes much more time, while LSTM has the highest accuracy (~94%) when location information is used.
ISSN: 2169-3536
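The summary describes classifying news using geolocation and time features, with k-NN as one of the three compared techniques. The record does not give the article's implementation, so the following is only a minimal sketch of how such a classifier might look. The function names (`encode`, `knn_predict`), the cyclic encoding of the hour, the value of `k`, and the toy training data with its labels are all assumptions made for illustration and are not taken from the article.

```python
import math
from collections import Counter


def encode(lat, lon, hour):
    """Build a numeric feature vector from latitude, longitude, and hour of day.

    Assumption: the hour is mapped onto a circle so that 23:00 and 01:00
    end up close together. The article may encode time differently.
    """
    angle = 2 * math.pi * hour / 24
    return (lat, lon, math.sin(angle), math.cos(angle))


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Invented toy data: two hypothetical regions with hypothetical news labels.
TRAIN = [
    (encode(10.0, 10.0, 8), "region_a_news"),
    (encode(10.2, 9.9, 9), "region_a_news"),
    (encode(9.8, 10.1, 20), "region_a_news"),
    (encode(50.0, 50.0, 8), "region_b_news"),
    (encode(50.1, 49.8, 13), "region_b_news"),
    (encode(49.9, 50.2, 21), "region_b_news"),
]

if __name__ == "__main__":
    # A reader located near region A in the morning.
    print(knn_predict(TRAIN, encode(10.1, 10.0, 10)))
```

In this sketch the coordinates are left in raw degrees, so location dominates the distance and the time features only break ties between nearby points. A real comparison would likely scale the features and evaluate accuracy on held-out data.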