Deep learning approach for microarray cancer data classification

Bibliographic Details
Main Authors: Hema Shekar Basavegowda, Guesh Dagnew
Format: Article
Language: English
Published: Wiley 2019-12-01
Series: CAAI Transactions on Intelligence Technology
Subjects:
Online Access: https://digital-library.theiet.org/content/journals/10.1049/trit.2019.0028
Description
Summary: Analysis of microarray data is highly challenging because of the inherent complexity of the data: high dimensionality, small sample size, class imbalance, noisy data structure, and high variance of feature values. These properties lead to lower classification accuracy and to over-fitting. In this work, the authors aimed to develop a deep feedforward method to classify the given microarray cancer data into a set of classes for subsequent diagnosis purposes. They used a 7-layer deep neural network architecture with parameters that vary by dataset. The small sample size and dimensionality problems are addressed using principal component analysis (PCA), a well-known dimensionality reduction technique. The feature values are scaled using the Min–Max approach, and the proposed approach is validated on eight standard microarray cancer datasets. Binary cross-entropy is used to measure the loss, and adaptive moment estimation (Adam) is used for optimisation. The performance of the proposed approach is evaluated using classification accuracy, precision, recall, f-measure, log-loss, receiver operating characteristic curve, and confusion matrix. A comparative analysis with state-of-the-art methods is carried out, and the proposed approach performs better than many of the existing methods.
ISSN: 2468-2322
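Illustrative note: the summary names Min–Max scaling, binary cross-entropy (log-loss), and precision, recall, and f-measure without giving formulas. The sketch below shows the standard textbook definitions of these quantities in plain Python. It is not the authors' code; the function names and the toy values are assumptions chosen for illustration, and the PCA step and the 7-layer network are not reproduced here.

```python
import math


def min_max_scale(column):
    """Rescale one feature column to the range [0, 1] (Min-Max scaling)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant feature carries no information; map it to 0.0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy (log-loss) over all samples.

    y_true holds 0/1 labels; y_prob holds predicted probabilities of class 1.
    Probabilities are clipped to [eps, 1 - eps] to avoid log(0).
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and f-measure for the positive class (label 1)."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical toy values, not taken from the article's datasets.
    print(min_max_scale([2.0, 4.0, 6.0]))
    print(binary_cross_entropy([1, 0], [0.5, 0.5]))
    print(precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0]))
```

In a pipeline of the kind the summary describes, scaling would be applied per feature before dimensionality reduction, the loss would be minimised during training, and the metrics would be computed on held-out predictions. That ordering is inferred from the summary rather than stated in this record.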