Predicting the clinical management of skin lesions using deep learning

Bibliographic Details
Main Authors: Kumar Abhishek, Jeremy Kawahara, Ghassan Hamarneh
Format: Article
Language: English
Published: Nature Publishing Group 2021-04-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-021-87064-7
Description
Summary: Automated machine learning approaches to skin lesion diagnosis from images are approaching dermatologist-level performance. However, current machine learning approaches that suggest management decisions rely on predicting the underlying skin condition to infer a management decision without considering the variability of management decisions that may exist within a single condition. We present the first work to explore image-based prediction of clinical management decisions directly without explicitly predicting the diagnosis. In particular, we use clinical and dermoscopic images of skin lesions along with patient metadata from the Interactive Atlas of Dermoscopy dataset (1011 cases; 20 disease labels; 3 management decisions) and demonstrate that predicting management labels directly is more accurate than predicting the diagnosis and then inferring the management decision (13.73 ± 3.93% and 6.59 ± 2.86% improvement in overall accuracy and AUROC respectively), statistically significant at p < 0.001. Directly predicting management decisions also considerably reduces the over-excision rate as compared to management decisions inferred from diagnosis predictions (24.56% fewer cases wrongly predicted to be excised). Furthermore, we show that training a model to also simultaneously predict the seven-point criteria and the diagnosis of skin lesions yields an even higher accuracy (improvements of 4.68 ± 1.89% and 2.24 ± 2.04% in overall accuracy and AUROC respectively) of management predictions. Finally, we demonstrate our model’s generalizability by evaluating on the publicly available MClass-D dataset and show that our model agrees with the clinical management recommendations of 157 dermatologists as much as they agree amongst each other.
ISSN: 2045-2322
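
The comparison described in the summary (inferring a management decision from a predicted diagnosis versus predicting the management decision directly) can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the diagnosis names, the management label names, the diagnosis-to-management lookup, and the four toy cases are invented and are not taken from the article or its dataset. The sketch only shows how a fixed lookup cannot represent different management decisions within a single condition, and how accuracy and the number of wrongly excised cases might be counted.

```python
from typing import Dict, List

# Hypothetical lookup from diagnosis to a single management decision.
# The label names are placeholders, not the article's label set.
DIAGNOSIS_TO_MANAGEMENT: Dict[str, str] = {
    "melanoma": "excise",
    "clark_nevus": "excise",
    "seborrheic_keratosis": "no_action",
}


def infer_management(diagnoses: List[str]) -> List[str]:
    """Map each predicted diagnosis to its one fixed management decision."""
    return [DIAGNOSIS_TO_MANAGEMENT[d] for d in diagnoses]


def accuracy(predicted: List[str], actual: List[str]) -> float:
    """Fraction of cases whose predicted management matches the actual one."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def over_excised(predicted: List[str], actual: List[str]) -> int:
    """Count cases predicted as 'excise' whose actual management was not."""
    return sum(p == "excise" and a != "excise" for p, a in zip(predicted, actual))


# Toy cases (invented). Cases 2 and 3 share a diagnosis but were managed
# differently, which a diagnosis-based lookup cannot express.
predicted_diagnoses = ["melanoma", "clark_nevus", "clark_nevus", "seborrheic_keratosis"]
actual_management = ["excise", "follow_up", "excise", "no_action"]

# Stand-in for the output of a model trained on management labels directly.
direct_predictions = ["excise", "follow_up", "excise", "no_action"]

inferred_predictions = infer_management(predicted_diagnoses)

print("inferred:", accuracy(inferred_predictions, actual_management),
      "over-excised:", over_excised(inferred_predictions, actual_management))
print("direct:  ", accuracy(direct_predictions, actual_management),
      "over-excised:", over_excised(direct_predictions, actual_management))
```

In this toy setup the lookup is wrong on the second case even though every diagnosis is assumed correct, because the lookup assigns one decision per condition. The numbers printed are properties of the invented data only and say nothing about the results reported in the article.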