When Collective Knowledge Meets Crowd Knowledge in a Smart City: A Prediction Method Combining Open Data Keyword Analysis and Case-Based Reasoning

Bibliographic Details
Main Authors: Ohbyung Kwon, Yun Seon Kim, Namyeon Lee, Yuchul Jung
Format: Article
Language: English
Published: Hindawi Limited 2018-01-01
Series: Journal of Healthcare Engineering
Online Access: http://dx.doi.org/10.1155/2018/7391793
Description
Summary: One of the significant issues in a smart city is maintaining a healthy environment. To improve the environment, huge amounts of data are gathered, manipulated, analyzed, and utilized, and these data may include noise, uncertainty, or unexpected mishandling. In some datasets, the class imbalance problem skews the learning performance of classification algorithms. In this paper, we propose a case-based reasoning method that combines crowd knowledge from open source data with collective knowledge. The method mitigates the class imbalance issues that arise in datasets used to diagnose wellness levels in patients suffering from stress or depression. We investigate effective ways to mitigate class imbalance in datasets where one class has a higher proportion than another. The proposed hybrid reasoning method, which combines crowd knowledge extracted from open source data (i.e., a Google search or other publicly accessible source) with collective knowledge (i.e., case-based reasoning), performs better than other traditional methods (e.g., SMO, BayesNet, IBk, Logistic, C4.5, and crowd reasoning). We also demonstrate that the use of open source and big data improves classification performance when used in addition to conventional classification algorithms.
ISSN: 2040-2295, 2040-2309
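
Illustration: The summary describes blending case-based reasoning with crowd knowledge to offset class imbalance, but this record does not say how the two are combined. The following is only a minimal sketch under stated assumptions: nearest-neighbor retrieval stands in for case-based reasoning, the crowd scores are hypothetical per-class values (for example, derived from keyword analysis), and the weighted sum with `alpha` is an illustrative choice rather than the authors' method. All function names and data are made up for the example.

```python
from collections import Counter
from math import dist


def cbr_scores(query, cases, k=3):
    """Retrieve the k nearest stored cases and return each label's vote share."""
    nearest = sorted(cases, key=lambda case: dist(query, case[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return {label: count / k for label, count in votes.items()}


def hybrid_predict(query, cases, crowd_scores, k=3, alpha=0.5):
    """Blend case-based vote shares with externally supplied crowd scores.

    alpha is a hypothetical mixing weight: 1.0 uses only the case base,
    0.0 uses only the crowd scores.
    """
    cbr = cbr_scores(query, cases, k)
    labels = set(cbr) | set(crowd_scores)
    blended = {
        label: alpha * cbr.get(label, 0.0)
        + (1 - alpha) * crowd_scores.get(label, 0.0)
        for label in labels
    }
    return max(blended, key=blended.get), blended


# Toy imbalanced case base: four "normal" cases and one "stressed" case.
cases = [
    ((0.1, 0.2), "normal"),
    ((0.2, 0.1), "normal"),
    ((0.3, 0.3), "normal"),
    ((0.4, 0.2), "normal"),
    ((0.9, 0.8), "stressed"),
]
query = (0.6, 0.6)

# Hypothetical crowd scores; in practice these would come from open data.
crowd = {"normal": 0.2, "stressed": 0.8}

cbr_only = cbr_scores(query, cases, k=3)
label, blended = hybrid_predict(query, cases, crowd, k=3, alpha=0.5)
print("case base alone:", max(cbr_only, key=cbr_only.get))
print("hybrid:", label)
```

In this toy data the majority class outvotes the single minority case among the three nearest neighbors, while the blended score lets the external evidence tip the decision toward the minority class.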