Summary: | Abstract In spite of machine learning has been successfully used in a wide range of healthcare applications, there are several parameters that could influence the performance of a machine learning system. One of the big issues for a machine learning algorithm is related to imbalanced dataset. An imbalanced dataset occurs when the distribution of data is not uniform. This makes harder the implementation of accurate models. In this paper, intelligent models are implemented to predict the hematocrit level of blood starting from visible spectral data. The aim of this work is to show the effects of two balancing techniques (SMOTE and SMOTE+ENN) on the imbalanced dataset of blood spectra. Four different machine learning systems are fitted with imbalanced and balanced datasets and their performances are compared showing an improvement, in terms of accuracy, due to the use of balancing.
|