Prediction of hematocrit through imbalanced dataset of blood spectra

Bibliographic Details
Main Authors: Cristoforo Decaro, Giovanni Battista Montanari, Marco Bianconi, Gaetano Bellanca
Format: Article
Language: English
Published: Wiley 2021-04-01
Series: Healthcare Technology Letters
Online Access: https://doi.org/10.1049/htl2.12006
Description
Summary: Although machine learning has been successfully used in a wide range of healthcare applications, several parameters can influence the performance of a machine learning system. One major issue for machine learning algorithms is an imbalanced dataset, which occurs when the distribution of data is not uniform. This makes accurate models harder to implement. In this paper, intelligent models are implemented to predict the hematocrit level of blood from visible spectral data. The aim of this work is to show the effects of two balancing techniques (SMOTE and SMOTE+ENN) on the imbalanced dataset of blood spectra. Four different machine learning systems are fitted with imbalanced and balanced datasets, and their performances are compared, showing an improvement in accuracy due to balancing.
ISSN: 2053-3713
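
Note: The summary names SMOTE as one of the two balancing techniques. As a rough illustration of the idea only, and not the authors' implementation, the sketch below oversamples a minority class by interpolating between a sample and one of its nearest minority neighbours. The function name `smote`, the toy three-feature "spectra", and the class sizes are all hypothetical. It also assumes that hematocrit levels are grouped into classes, which the summary does not state. In practice a library such as imbalanced-learn provides `SMOTE` and `SMOTEENN`.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority neighbours."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy data: 3 minority samples against a majority class of 8.
minority = [[0.10, 0.20, 0.30], [0.12, 0.22, 0.28], [0.09, 0.25, 0.33]]
majority_count = 8

new_samples = smote(minority, n_new=majority_count - len(minority), k=2)
balanced_minority = minority + new_samples
print(len(balanced_minority))
```

Each synthetic point lies on the line segment between two real minority samples, so it stays inside the range the minority class already covers. SMOTE+ENN would follow this step with an edited-nearest-neighbours cleaning pass, which is not shown here.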