Dimension reduction methods for microarray data: a review

Dimension reduction has become inevitable for pre-processing of high dimensional data. “Gene expression microarray data” is an instance of such high dimensional data. Gene expression microarray data displays the maximum number of genes (features) simultaneously at a molecular level with a very small...

Full description

Bibliographic Details
Main Authors: Rabia Aziz, C.K. Verma, Namita Srivastava
Format: Article
Language:English
Published: AIMS Press 2017-03-01
Series:AIMS Bioengineering
Subjects:
Online Access:http://www.aimspress.com/Bioengineering/article/1315/fulltext.html
Description
Summary:Dimension reduction has become inevitable for pre-processing of high dimensional data. “Gene expression microarray data” is an instance of such high dimensional data. Gene expression microarray data displays the maximum number of genes (features) simultaneously at a molecular level with a very small number of samples. The copious numbers of genes are usually provided to a learning algorithm for producing a complete characterization of the classification task. However, most of the times the majority of the genes are irrelevant or redundant to the learning task. It will deteriorate the learning accuracy and training speed as well as lead to the problem of overfitting. Thus, dimension reduction of microarray data is a crucial preprocessing step for prediction and classification of disease. Various feature selection and feature extraction techniques have been proposed in the literature to identify the genes, that have direct impact on the various machine learning algorithms for classification and eliminate the remaining ones. This paper describes the taxonomy of dimension reduction methods with their characteristics, evaluation criteria, advantages and disadvantages. It also presents a review of numerous dimension reduction approaches for microarray data, mainly those methods that have been proposed over the past few years.
ISSN:2375-1495