Predicting Future High-Cost Schizophrenia Patients Using High-Dimensional Administrative Data

Background: The burden of serious and persistent mental illness such as schizophrenia is substantial and requires health-care organizations to have adequate risk adjustment models to effectively allocate their resources to managing patients who are at the greatest risk. Currently available models unde...


Bibliographic Details
Main Authors: Yajuan Wang, Vijay Iyengar, Jianying Hu, David Kho, Erin Falconer, John P. Docherty, Gigi Y. Yuen
Format: Article
Language: English
Published: Frontiers Media S.A. 2017-06-01
Series: Frontiers in Psychiatry
Subjects: health-care cost; machine learning; feature selection; model selection; schizophrenia
Online Access: http://journal.frontiersin.org/article/10.3389/fpsyt.2017.00114/full
id doaj-c1c1e2839815494db99e4488221bd0e8
record_format Article
spelling doaj-c1c1e2839815494db99e4488221bd0e8 2020-11-24T22:48:17Z eng
Frontiers Media S.A.
Frontiers in Psychiatry
1664-0640
2017-06-01
8
10.3389/fpsyt.2017.00114
259661
Predicting Future High-Cost Schizophrenia Patients Using High-Dimensional Administrative Data
Yajuan Wang [0]
Vijay Iyengar [1]
Jianying Hu [2]
David Kho [3]
Erin Falconer [4]
John P. Docherty [5]
Gigi Y. Yuen [6]
[0] Innovation and Foundational Technology, IBM Watson Health, Yorktown Heights, NY, United States
[1] IBM T.J. Watson Research Center, Yorktown Heights, NY, United States
[2] IBM T.J. Watson Research Center, Yorktown Heights, NY, United States
[3] Medical Strategy, ODH, Inc., Princeton, NJ, United States
[4] Medical Strategy, ODH, Inc., Princeton, NJ, United States
[5] Medical Strategy, ODH, Inc., Princeton, NJ, United States
[6] Innovation and Foundational Technology, IBM Watson Health, Yorktown Heights, NY, United States
collection DOAJ
language English
format Article
sources DOAJ
author Yajuan Wang
Vijay Iyengar
Jianying Hu
David Kho
Erin Falconer
John P. Docherty
Gigi Y. Yuen
title Predicting Future High-Cost Schizophrenia Patients Using High-Dimensional Administrative Data
publisher Frontiers Media S.A.
series Frontiers in Psychiatry
issn 1664-0640
publishDate 2017-06-01
description Background: The burden of serious and persistent mental illness such as schizophrenia is substantial and requires health-care organizations to have adequate risk adjustment models to effectively allocate their resources to managing patients who are at the greatest risk. Currently available models underestimate health-care costs for those with mental or behavioral health conditions.
Objectives: The study aimed to develop and evaluate predictive models for identification of future high-cost schizophrenia patients using advanced supervised machine learning methods.
Methods: This was a retrospective study using a payer administrative database. The study cohort consisted of 97,862 patients diagnosed with schizophrenia (ICD-9 code 295.*) from January 2009 to June 2014. Training (n = 34,510) and study evaluation (n = 30,077) cohorts were derived based on 12-month observation and prediction windows (PWs). The target was average total cost/patient/month in the PW. Three models (baseline, intermediate, final) were developed to assess the value of different variable categories for cost prediction (demographics, coverage, cost, health-care utilization, antipsychotic medication usage, and clinical conditions). Scalable orthogonal regression, significant attribute selection in high dimensions method, and random forests regression were used to develop the models. The trained models were assessed in the evaluation cohort using the regression R², patient classification accuracy (PCA), and cost accuracy (CA). The model performance was compared to the Centers for Medicare & Medicaid Services Hierarchical Condition Categories (CMS-HCC) model.
Results: At top 10% cost cutoff, the final model achieved 0.23 R², 43% PCA, and 63% CA; in contrast, the CMS-HCC model achieved 0.09 R², 27% PCA with 45% CA. The final model and the CMS-HCC model identified 33 and 22%, respectively, of total cost at the top 10% cost cutoff.
Conclusion: Using advanced feature selection leveraging detailed health care, medication utilization features, and supervised machine learning methods improved the ability to predict and identify future high-cost patients with schizophrenia when compared with the CMS-HCC model.
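The abstract reports three evaluation measures at a top 10% cost cutoff (R², PCA, and CA) without defining PCA or CA. The following is a minimal Python sketch of how such measures could be computed, under stated assumptions: PCA is taken to be the share of actual top-cost patients that also fall in the predicted top group, and CA the actual cost of the predicted top group divided by the actual cost of the true top group. These definitions, the function names `top_k_mask` and `evaluate`, and the toy numbers are illustrative assumptions, not taken from the record or confirmed as the authors' method.

```python
import numpy as np


def top_k_mask(values, frac):
    """Boolean mask marking the ceil(frac * n) largest values (ties broken by position)."""
    values = np.asarray(values, dtype=float)
    k = max(1, int(np.ceil(frac * values.size)))
    top_idx = np.argsort(-values, kind="stable")[:k]
    mask = np.zeros(values.size, dtype=bool)
    mask[top_idx] = True
    return mask


def evaluate(actual, predicted, frac=0.10):
    """Return R^2 plus assumed PCA / CA definitions at a top-`frac` cost cutoff."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)

    # Regression R^2: 1 - residual sum of squares / total sum of squares.
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot

    true_top = top_k_mask(actual, frac)     # patients who really were high cost
    pred_top = top_k_mask(predicted, frac)  # patients the model flags as high cost

    # Assumed PCA: fraction of true high-cost patients that the model also flags.
    pca = (true_top & pred_top).sum() / true_top.sum()
    # Assumed CA: actual cost captured by the flagged group, relative to the
    # actual cost of the true high-cost group.
    ca = actual[pred_top].sum() / actual[true_top].sum()
    # Share of total actual cost accounted for by the flagged group.
    cost_share = actual[pred_top].sum() / actual.sum()

    return {"r2": r2, "pca": pca, "ca": ca, "cost_share": cost_share}


if __name__ == "__main__":
    # Toy monthly costs for four hypothetical patients; not data from the study.
    metrics = evaluate([100, 50, 10, 40], [90, 20, 30, 60], frac=0.5)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Under these assumed definitions, CA can never fall below the cost share, because the true top group never holds more than the total cost. The reported figures for the final model (63% CA against 33% of total cost) are at least consistent with that relationship.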
topic health-care cost
machine learning
feature selection
model selection
schizophrenia
url http://journal.frontiersin.org/article/10.3389/fpsyt.2017.00114/full