Disease Prediction by Machine Learning Over Big Data From Healthcare Communities

With big data growth in biomedical and healthcare communities, accurate analysis of medical data benefits early disease detection, patient care, and community services. However, analysis accuracy is reduced when the medical data is incomplete. Moreover, different regions exhibit uniqu...

Bibliographic Details
Main Authors: Min Chen, Yixue Hao, Kai Hwang, Lu Wang, Lin Wang
Format: Article
Language: English
Published: IEEE 2017-01-01
Series: IEEE Access
Subjects: Big data analytics, machine learning, healthcare
Online Access: https://ieeexplore.ieee.org/document/7912315/
id doaj-25be7db3263d4838b401b3bfdafd8074
record_format Article
spelling doaj-25be7db3263d4838b401b3bfdafd8074
2021-03-29T20:07:21Z
eng
IEEE
IEEE Access
2169-3536
2017-01-01
vol. 5, pp. 8869-8879
doi 10.1109/ACCESS.2017.2694446
document 7912315
Disease Prediction by Machine Learning Over Big Data From Healthcare Communities
Min Chen [0] https://orcid.org/0000-0002-0960-4447
Yixue Hao [1]
Kai Hwang [2] https://orcid.org/0000-0003-2673-4953
Lu Wang [3]
Lin Wang [4]
[0] School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
[1] School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
[2] University of Southern California, Los Angeles, CA, USA
[3] School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
[4] Research Center for Tissue Engineering and Regenerative Medicine, Huazhong University of Science and Technology, Wuhan, China
With big data growth in biomedical and healthcare communities, accurate analysis of medical data benefits early disease detection, patient care, and community services. However, analysis accuracy is reduced when the medical data is incomplete. Moreover, different regions exhibit unique characteristics of certain regional diseases, which may weaken the prediction of disease outbreaks. In this paper, we streamline machine learning algorithms for effective prediction of chronic disease outbreak in disease-frequent communities. We evaluate the modified prediction models on real-life hospital data collected from central China in 2013-2015. To overcome the difficulty of incomplete data, we use a latent factor model to reconstruct the missing data. We experiment on cerebral infarction, a regional chronic disease. We propose a new convolutional neural network (CNN)-based multimodal disease risk prediction algorithm using structured and unstructured data from the hospital. To the best of our knowledge, no existing work has focused on both data types in the area of medical big data analytics. Compared with several typical prediction algorithms, the prediction accuracy of our proposed algorithm reaches 94.8%, with a convergence speed faster than that of the CNN-based unimodal disease risk prediction algorithm.
https://ieeexplore.ieee.org/document/7912315/
Big data analytics
machine learning
healthcare
collection DOAJ
language English
format Article
sources DOAJ
author Min Chen
Yixue Hao
Kai Hwang
Lu Wang
Lin Wang
spellingShingle Min Chen
Yixue Hao
Kai Hwang
Lu Wang
Lin Wang
Disease Prediction by Machine Learning Over Big Data From Healthcare Communities
IEEE Access
Big data analytics
machine learning
healthcare
author_facet Min Chen
Yixue Hao
Kai Hwang
Lu Wang
Lin Wang
author_sort Min Chen
title Disease Prediction by Machine Learning Over Big Data From Healthcare Communities
title_short Disease Prediction by Machine Learning Over Big Data From Healthcare Communities
title_full Disease Prediction by Machine Learning Over Big Data From Healthcare Communities
title_fullStr Disease Prediction by Machine Learning Over Big Data From Healthcare Communities
title_full_unstemmed Disease Prediction by Machine Learning Over Big Data From Healthcare Communities
title_sort disease prediction by machine learning over big data from healthcare communities
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2017-01-01
description With big data growth in biomedical and healthcare communities, accurate analysis of medical data benefits early disease detection, patient care, and community services. However, analysis accuracy is reduced when the medical data is incomplete. Moreover, different regions exhibit unique characteristics of certain regional diseases, which may weaken the prediction of disease outbreaks. In this paper, we streamline machine learning algorithms for effective prediction of chronic disease outbreak in disease-frequent communities. We evaluate the modified prediction models on real-life hospital data collected from central China in 2013-2015. To overcome the difficulty of incomplete data, we use a latent factor model to reconstruct the missing data. We experiment on cerebral infarction, a regional chronic disease. We propose a new convolutional neural network (CNN)-based multimodal disease risk prediction algorithm using structured and unstructured data from the hospital. To the best of our knowledge, no existing work has focused on both data types in the area of medical big data analytics. Compared with several typical prediction algorithms, the prediction accuracy of our proposed algorithm reaches 94.8%, with a convergence speed faster than that of the CNN-based unimodal disease risk prediction algorithm.
topic Big data analytics
machine learning
healthcare
url https://ieeexplore.ieee.org/document/7912315/
work_keys_str_mv AT minchen diseasepredictionbymachinelearningoverbigdatafromhealthcarecommunities
AT yixuehao diseasepredictionbymachinelearningoverbigdatafromhealthcarecommunities
AT kaihwang diseasepredictionbymachinelearningoverbigdatafromhealthcarecommunities
AT luwang diseasepredictionbymachinelearningoverbigdatafromhealthcarecommunities
AT linwang diseasepredictionbymachinelearningoverbigdatafromhealthcarecommunities
_version_ 1724195176398716928
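
The description above says a latent factor model is used to reconstruct missing data, but this record gives no detail on the authors' formulation. The following is only a minimal sketch of the general idea, not the paper's method: it assumes a single latent factor (rank 1), models each table entry as `p[i] * q[j]`, and fits the factors by alternating ridge-regularized least squares on the observed entries. The function name `reconstruct_missing` and the parameters `lam` and `iters` are hypothetical, chosen for illustration.

```python
def reconstruct_missing(rows, lam=1e-3, iters=50):
    """Fill None entries of a numeric table with a rank-1 latent factor model.

    Assumption for this sketch: entry (i, j) is approximated by p[i] * q[j].
    p and q are updated in turn, each by a closed-form ridge least-squares
    step that uses only the observed (non-None) entries.
    """
    m, n = len(rows), len(rows[0])
    p = [0.0] * m   # one latent value per row (e.g. per patient)
    q = [1.0] * n   # one latent value per column (e.g. per attribute)
    for _ in range(iters):
        # Update row factors with column factors held fixed.
        for i in range(m):
            obs = [j for j in range(n) if rows[i][j] is not None]
            num = sum(q[j] * rows[i][j] for j in obs)
            den = sum(q[j] ** 2 for j in obs) + lam
            p[i] = num / den
        # Update column factors with row factors held fixed.
        for j in range(n):
            obs = [i for i in range(m) if rows[i][j] is not None]
            num = sum(p[i] * rows[i][j] for i in obs)
            den = sum(p[i] ** 2 for i in obs) + lam
            q[j] = num / den
    # Keep observed values; use the model only where data was missing.
    return [[rows[i][j] if rows[i][j] is not None else p[i] * q[j]
             for j in range(n)] for i in range(m)]


table = [[1.0, 2.0, 4.0],
         [2.0, 4.0, None],   # missing value to reconstruct
         [3.0, 6.0, 12.0]]
filled = reconstruct_missing(table)
print(filled[1][2])
```

Because the rows of this toy table are proportional to one another, the estimate for the missing entry should land close to 8. Real hospital data would likely need more than one latent factor and a tuned regularization weight.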