Summary: | The emergence of personalized medicine and its rapid advancement create a need for adequate medical decision-making models. Given the detailed data that personalized medicine involves, building a medical decision-making system faces several obstacles, including data representation, data reduction, data classification, and overall processing complexity. To address these challenges, this paper aims to create a model that classifies new patient data with efficient computation by selecting the best sequence of data processing steps. Our methodology consists of four tasks. The first task represents the data with a recent model. The second task produces a distance matrix. The third task reduces the dimensionality of this matrix. The fourth task applies a classifier to the reduced-dimensionality results. We tested several distance measures, dimensionality reduction methods, and classification techniques to achieve maximum performance. The evaluation shows that the proposed model performs well with several classifiers: the F-measure reaches 0.917, 0.923, and 0.987 for the 3-NN, random forest (RF), and support vector machine (SVM) classifiers, respectively. In addition to these performance measures, computation time is taken into account when choosing among the methods derived from the proposed model (2 ms, 76 ms, and 118 ms for the 3-NN, RF, and SVM classifiers, respectively). Based on the performance and processing time criteria, we define three use-case scenarios. For practical cases, we recommend the RF classifier applied to data reduced by the t-distributed stochastic neighbor embedding (t-SNE) technique as a compromise between performance and speed.
|
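
The tasks described in the summary (distance matrix, dimensionality reduction, classification) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes scikit-learn, uses synthetic features as a stand-in for the data representation of the first task (which the summary does not specify), and picks Euclidean distance, a 2-dimensional t-SNE embedding, and an RF classifier as one possible combination. The function name `run_pipeline` and all parameter values are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.manifold import TSNE
from sklearn.metrics import f1_score, pairwise_distances
from sklearn.model_selection import train_test_split


def run_pipeline(X, y, metric="euclidean", seed=0):
    """Hypothetical sketch: distance matrix -> t-SNE -> random forest."""
    # Task 2: pairwise distance matrix between all records (n x n).
    D = pairwise_distances(X, metric=metric)

    # Task 3: reduce the distance matrix to 2 dimensions with t-SNE.
    # A precomputed metric requires a non-PCA initialisation.
    # scikit-learn's TSNE has no transform() for unseen rows, so this
    # sketch embeds all records at once (an assumption of the sketch).
    tsne = TSNE(n_components=2, metric="precomputed", init="random",
                perplexity=20, random_state=seed)
    Z = tsne.fit_transform(D)

    # Task 4: train and evaluate a classifier on the reduced data.
    Z_train, Z_test, y_train, y_test = train_test_split(
        Z, y, test_size=0.3, stratify=y, random_state=seed)
    clf = RandomForestClassifier(n_estimators=100, random_state=seed)
    clf.fit(Z_train, y_train)
    f1 = f1_score(y_test, clf.predict(Z_test), average="macro")
    return Z, f1


# Task 1 stand-in: synthetic feature vectors instead of real patient data.
X, y = make_classification(n_samples=120, n_features=20,
                           n_informative=5, random_state=0)
Z, f1 = run_pipeline(X, y)
print(Z.shape, round(f1, 3))
```

Swapping `metric`, the reduction step, or the classifier in this sketch mirrors the kind of comparison the summary describes; the scores it prints on synthetic data say nothing about the results reported in the paper.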