Detection and Classification of Immature Leukocytes for Diagnosis of Acute Myeloid Leukemia Using Random Forest Algorithm

Acute myeloid leukemia (AML) is a fatal blood cancer that progresses rapidly and hinders the function of blood cells and the immune system. The current AML diagnostic method, a manual examination of the peripheral blood smear, is time-consuming, labor-intensive, and suffers from considerable inter-o...


Bibliographic Details
Main Authors: Satvik Dasariraju, Marc Huo, Serena McCalla
Format: Article
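Language: English
Published: MDPI AG 2020-10-01
Series: Bioengineering
Subjects: acute myeloid leukemia; peripheral blood smear; immature leukocyte; segmentation; cytomorphology; machine learning
Online Access: https://www.mdpi.com/2306-5354/7/4/120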
id doaj-b5d2581c37ca4a40a2b3a418cc08d2b7
record_format Article
spelling doaj-b5d2581c37ca4a40a2b3a418cc08d2b7
2020-11-25T01:46:33Z
eng
MDPI AG
Bioengineering
2306-5354
2020-10-01
7
120
120
10.3390/bioengineering7040120
Detection and Classification of Immature Leukocytes for Diagnosis of Acute Myeloid Leukemia Using Random Forest Algorithm
Satvik Dasariraju (iResearch Institute, Glen Cove, NY 11542, USA)
Marc Huo (iResearch Institute, Glen Cove, NY 11542, USA)
Serena McCalla (iResearch Institute, Glen Cove, NY 11542, USA)
Acute myeloid leukemia (AML) is a fatal blood cancer that progresses rapidly and hinders the function of blood cells and the immune system. The current AML diagnostic method, a manual examination of the peripheral blood smear, is time-consuming, labor-intensive, and suffers from considerable inter-observer variation. Herein, a machine learning model to detect and classify immature leukocytes for efficient diagnosis of AML is presented. Images of leukocytes in AML patients and healthy controls were obtained from a publicly available dataset in The Cancer Imaging Archive. Image format conversion, multi-Otsu thresholding, and morphological operations were used for segmentation of the nucleus and cytoplasm. From each image, 16 features were extracted, two of which are new nucleus color features proposed in this study. A random forest algorithm was trained for the detection and classification of immature leukocytes. The model achieved 92.99% accuracy for detection and 93.45% accuracy for classification of immature leukocytes into four types. Precision values for each class were above 65%, which is an improvement on the current state of the art. Based on Gini importance, the nucleus-to-cytoplasm area ratio was a discriminative feature for both detection and classification, while the two proposed features were shown to be significant for classification. The proposed model can be used as a support tool for the diagnosis of AML, and the features calculated to be most important serve as a baseline for future research.
https://www.mdpi.com/2306-5354/7/4/120
acute myeloid leukemia
peripheral blood smear
immature leukocyte
segmentation
cytomorphology
machine learning
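The abstract above names multi-Otsu thresholding as the segmentation step and the nucleus-to-cytoplasm area ratio as a key feature. The following is a minimal pure-Python sketch of that idea, not the authors' implementation: it searches exhaustively for two thresholds that maximize between-class variance over a grayscale histogram, then counts pixels per class. The function names, the 8-bit intensity range, and the assumption that the nucleus is the darkest class and the cytoplasm the middle class are illustrative only. Image format conversion and morphological clean-up are omitted.

```python
from itertools import accumulate


def multi_otsu_3(pixels, levels=256):
    """Return (t1, t2) splitting integer intensities into three classes:
    v <= t1, t1 < v <= t2, v > t2, maximizing between-class variance."""
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    # Prefix sums of pixel counts and of intensity-weighted counts.
    cum_n = [0] + list(accumulate(hist))
    cum_s = [0] + list(accumulate(i * h for i, h in enumerate(hist)))

    def term(lo, hi):
        # n_k * mean_k**2 for the class covering levels lo .. hi-1.
        n = cum_n[hi] - cum_n[lo]
        s = cum_s[hi] - cum_s[lo]
        return s * s / n if n else 0.0

    best, best_t = -1.0, (0, 1)
    for t1 in range(levels - 2):
        for t2 in range(t1 + 1, levels - 1):
            # Maximizing the sum of n_k * mean_k**2 is equivalent to
            # maximizing between-class variance (the total mean is fixed).
            score = term(0, t1 + 1) + term(t1 + 1, t2 + 1) + term(t2 + 1, levels)
            if score > best:
                best, best_t = score, (t1, t2)
    return best_t


def nucleus_cytoplasm_ratio(pixels):
    """Area ratio, ASSUMING darkest class = nucleus, middle class = cytoplasm."""
    t1, t2 = multi_otsu_3(pixels)
    nucleus = sum(1 for v in pixels if v <= t1)
    cytoplasm = sum(1 for v in pixels if t1 < v <= t2)
    return nucleus / cytoplasm if cytoplasm else float("inf")


# Made-up flattened "image": 30 dark, 50 mid-gray, 100 bright pixels.
demo = [40] * 30 + [120] * 50 + [220] * 100
print(nucleus_cytoplasm_ratio(demo))  # 0.6
```

In practice a library routine such as scikit-image's `threshold_multiotsu` would likely replace the exhaustive search; the version here only makes the criterion explicit.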
collection DOAJ
language English
format Article
sources DOAJ
author Satvik Dasariraju
Marc Huo
Serena McCalla
title Detection and Classification of Immature Leukocytes for Diagnosis of Acute Myeloid Leukemia Using Random Forest Algorithm
publisher MDPI AG
series Bioengineering
issn 2306-5354
publishDate 2020-10-01
description Acute myeloid leukemia (AML) is a fatal blood cancer that progresses rapidly and hinders the function of blood cells and the immune system. The current AML diagnostic method, a manual examination of the peripheral blood smear, is time-consuming, labor-intensive, and suffers from considerable inter-observer variation. Herein, a machine learning model to detect and classify immature leukocytes for efficient diagnosis of AML is presented. Images of leukocytes in AML patients and healthy controls were obtained from a publicly available dataset in The Cancer Imaging Archive. Image format conversion, multi-Otsu thresholding, and morphological operations were used for segmentation of the nucleus and cytoplasm. From each image, 16 features were extracted, two of which are new nucleus color features proposed in this study. A random forest algorithm was trained for the detection and classification of immature leukocytes. The model achieved 92.99% accuracy for detection and 93.45% accuracy for classification of immature leukocytes into four types. Precision values for each class were above 65%, which is an improvement on the current state of the art. Based on Gini importance, the nucleus-to-cytoplasm area ratio was a discriminative feature for both detection and classification, while the two proposed features were shown to be significant for classification. The proposed model can be used as a support tool for the diagnosis of AML, and the features calculated to be most important serve as a baseline for future research.
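The description ranks features by Gini importance. As a rough illustration of what that quantity measures, the sketch below computes the Gini impurity decrease produced by a single threshold split on one feature; a random forest's Gini importance accumulates such decreases, weighted by node size, over every split that uses the feature and averages them across trees. The feature values, class labels, and thresholds are invented for illustration and are not taken from the study.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_decrease(values, labels, threshold):
    """Impurity decrease from splitting the samples at `values <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
    return gini(labels) - weighted


# Hypothetical data: six cells, three per class.
labels = ["mature"] * 3 + ["immature"] * 3
nc_ratio = [0.20, 0.30, 0.35, 0.70, 0.80, 0.90]  # separates the classes
other = [1, 2, 3, 1, 2, 3]                       # carries no class signal

print(gini_decrease(nc_ratio, labels, 0.5))  # 0.5
print(gini_decrease(other, labels, 1.5))     # 0.0
```

A feature whose splits remove more impurity, as the made-up ratio does here, ends up with a higher importance score.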
topic acute myeloid leukemia
peripheral blood smear
immature leukocyte
segmentation
cytomorphology
machine learning
url https://www.mdpi.com/2306-5354/7/4/120