Summary: | A data-driven decision support system is of great significance for the prognosis of scoliosis. However, developing one that is both accurate and interpretable is challenging: 1) scoliosis data collected in clinical environments are heterogeneous, unstructured, and incomplete; 2) the cause of adolescent idiopathic scoliosis is still unknown, and the effects of some measured indicators are unclear; and 3) some treatments, such as wearing a brace, affect the progression of scoliosis. The main contributions of the paper are to: 1) propose and combine different imputation methods, such as Local Linear Interpolation (LLI) and Global Statistic Approximation (GSA), to handle the complicated types of incomplete data found in clinical environments; 2) identify important features relevant to the severity of scoliosis using an embedded feature selection method; and 3) establish and compare scoliosis prediction models built with multiple linear regression, k-nearest neighbors, decision tree, support vector machine, and random forest algorithms. Prediction performance is evaluated in terms of mean absolute error, root mean square error, mean absolute percentage error, and the Pearson correlation coefficient. With only a few critical features, the prediction models achieve satisfactory performance. Experiments show that the models are highly interpretable and viable for supporting decision-making in clinical environments.
|