Integrating Co-Clustering and Interpretable Machine Learning for the Prediction of Intravenous Immunoglobulin Resistance in Kawasaki Disease

Identifying intravenous immunoglobulin-resistant patients is essential for the prompt and optimal treatment of Kawasaki disease, suggesting the need for effective risk assessment tools. Data-driven approaches have the potential to identify the high-risk individuals by capturing the complex patterns...

Bibliographic Details
Main Authors: Haolin Wang, Zhilin Huang, Danfeng Zhang, Johan Arief, Tiewei Lyu, Jie Tian
Format: Article
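Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Co-clustering; interpretable machine learning; medical informatics; predictive models; Kawasaki disease
Online Access: https://ieeexplore.ieee.org/document/9097874/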
id doaj-5bd382f7756348bb8f6ead8d76638c0d
record_format Article
spelling doaj-5bd382f7756348bb8f6ead8d76638c0d 2021-03-30T02:16:48Z eng
Publisher: IEEE
Series: IEEE Access
ISSN: 2169-3536
Published: 2020-01-01
Volume: 8
Pages: 97064-97071
DOI: 10.1109/ACCESS.2020.2996302
Document number: 9097874
Title: Integrating Co-Clustering and Interpretable Machine Learning for the Prediction of Intravenous Immunoglobulin Resistance in Kawasaki Disease
Authors and affiliations:
Haolin Wang (0), https://orcid.org/0000-0002-1735-9525: College of Medical Informatics, Chongqing Medical University, Chongqing, China
Zhilin Huang (1), Danfeng Zhang (2), Johan Arief (3), Tiewei Lyu (4), Jie Tian (5): Department of Cardiology, Heart Centre, Children’s Hospital of Chongqing Medical University, Ministry of Education Key Laboratory of Child Development and Disorders, National Center for Clinical Medicine Research in Children’s Health and Disease, Chongqing, China
Online access: https://ieeexplore.ieee.org/document/9097874/
Keywords: Co-clustering; interpretable machine learning; medical informatics; predictive models; Kawasaki disease
collection DOAJ
language English
format Article
sources DOAJ
author Haolin Wang
Zhilin Huang
Danfeng Zhang
Johan Arief
Tiewei Lyu
Jie Tian
title Integrating Co-Clustering and Interpretable Machine Learning for the Prediction of Intravenous Immunoglobulin Resistance in Kawasaki Disease
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description Identifying intravenous immunoglobulin-resistant patients is essential for the prompt and optimal treatment of Kawasaki disease, suggesting the need for effective risk assessment tools. Data-driven approaches have the potential to identify the high-risk individuals by capturing the complex patterns of real-world data. To enable clinically applicable prediction of intravenous immunoglobulin resistance addressing the incompleteness of clinical data and the lack of interpretability of machine learning models, a multi-stage method is developed by integrating data missing pattern mining and intelligible models. First, co-clustering is adopted to characterize the block-wise data missing patterns by simultaneously grouping the clinical features and patients to enable (a) group-based feature selection and missing data imputation and (b) patient subgroup-specific predictive models considering the availability of data. Second, feature selection is performed using the group Lasso to uncover group-specific risk factors. Third, the Explainable Boosting Machine, which is an interpretable learning method based on generalized additive models, is applied for the prediction of each patient subgroup. The experiments using real-world Electronic Health Records demonstrate the superior performance of the proposed framework for predictive modeling compared with a set of benchmark methods. This study highlights the integration of co-clustering and supervised learning methods for incomplete clinical data mining, and promotes data-driven approaches to investigate predictors and effective algorithms for decision making in healthcare.
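
The first stage described above, grouping patients and clinical features at the same time by where data are missing, can be illustrated with a minimal sketch. This is not the authors' co-clustering algorithm: it uses exact-match grouping of missingness patterns as a crude stand-in, and the feature names, patient ids, and values are hypothetical toy data introduced only for illustration.

```python
from collections import defaultdict

# Hypothetical toy records (not from the paper): None marks a missing value.
FEATURES = ["crp", "albumin", "platelets", "echo_z"]
PATIENTS = {
    "p1": [12.0, 3.1, 410, None],
    "p2": [8.5, 3.4, 390, None],
    "p3": [None, None, 520, 2.1],
    "p4": [None, None, 480, 1.7],
    "p5": [15.2, 2.9, 600, 2.5],
}


def missing_mask(row):
    """Return a tuple of 0/1 flags, 1 where the value is missing."""
    return tuple(int(v is None) for v in row)


def group_rows_by_pattern(patients):
    """Group patient ids that share exactly the same missingness pattern."""
    groups = defaultdict(list)
    for pid, row in patients.items():
        groups[missing_mask(row)].append(pid)
    return dict(groups)


def group_columns_by_pattern(patients, features):
    """Group features that are missing for exactly the same patients."""
    ids = sorted(patients)
    groups = defaultdict(list)
    for j, name in enumerate(features):
        column = tuple(int(patients[pid][j] is None) for pid in ids)
        groups[column].append(name)
    return dict(groups)


row_groups = group_rows_by_pattern(PATIENTS)
col_groups = group_columns_by_pattern(PATIENTS, FEATURES)

# Each patient subgroup would get its own model, restricted to the
# features that are actually observed for that subgroup.
for mask, pids in row_groups.items():
    available = [f for f, m in zip(FEATURES, mask) if m == 0]
    print(pids, "-> features available:", available)
```

Real clinical data rarely show exact block patterns, so a genuine co-clustering method that tolerates noisy blocks would replace the exact-match grouping here. The sketch only shows why row and column groups together determine which features each subgroup-specific model can use.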
topic Co-clustering
interpretable machine learning
medical informatics
predictive models
Kawasaki disease
url https://ieeexplore.ieee.org/document/9097874/