Enhanced dimensionality reduction methods for classifying malaria vector dataset using decision tree

RNA-Seq data are used in biological applications and in decision making for gene classification. Much recent work has focused on reducing the dimensionality of RNA-Seq data. Dimensionality reduction approaches have been proposed for extracting relevant information from a given dataset. In this s...

Bibliographic Details
Main Authors: Arowolo, Micheal Olaolu (Author), Adebiyi, Marion Olubunmi (Author), Adebiyi, Ayodele Ariyo (Author)
Format: Article
Language: English
Published: Penerbit Universiti Kebangsaan Malaysia, 2021-09.
Online Access: Get fulltext
LEADER 01785 am a22001453u 4500
001 18056
042 |a dc 
100 1 0 |a Arowolo, Micheal Olaolu  |e author 
700 1 0 |a Adebiyi, Marion Olubunmi  |e author 
700 1 0 |a Adebiyi, Ayodele Ariyo  |e author 
245 0 0 |a Enhanced dimensionality reduction methods for classifying malaria vector dataset using decision tree 
260 |b Penerbit Universiti Kebangsaan Malaysia,   |c 2021-09. 
856 |z Get fulltext  |u http://journalarticle.ukm.my/18056/1/7.pdf 
520 |a RNA-Seq data are used in biological applications and in decision making for gene classification. Much recent work has focused on reducing the dimensionality of RNA-Seq data, and dimensionality reduction approaches have been proposed for extracting the relevant information from a given dataset. In this study, a novel optimized dimensionality reduction algorithm is proposed that combines an optimized genetic algorithm with Principal Component Analysis and Independent Component Analysis (GA-O-PCA and GAO-ICA), which are used to identify an optimum feature subset and latent correlated features, respectively. A decision tree classifier is applied to the reduced Anopheles gambiae mosquito dataset to enhance accuracy and scalability in gene expression analysis. The proposed algorithm extracts relevant features from the high-dimensional input feature space, using feature ranking and earlier experience. The model's performance is evaluated and validated using classification accuracy and compared with existing approaches in the literature. The experimental results are promising for feature selection and classification in gene expression data analysis and indicate that the approach is a capable addition to existing data mining techniques.
546 |a en
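
The general idea in the abstract (reduce a high-dimensional expression matrix, then classify with a decision tree) can be illustrated with a minimal sketch. This is not the authors' GA-O-PCA or GAO-ICA method: the genetic-algorithm optimization step is omitted, plain PCA and FastICA from scikit-learn are swapped in, and the data are a synthetic stand-in rather than the Anopheles gambiae dataset. The helper name `evaluate_reducer`, the component count, and all dataset sizes are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, FastICA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier


def evaluate_reducer(reducer, X_train, X_test, y_train, y_test):
    """Fit reducer + decision tree on the training split; return test accuracy.

    Hypothetical helper for this sketch, not part of the cited article.
    """
    model = make_pipeline(reducer, DecisionTreeClassifier(random_state=0))
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)


# Synthetic stand-in for a gene expression matrix: few samples, many features.
# (Assumption: the real dataset is not reproduced here.)
X, y = make_classification(
    n_samples=200, n_features=500, n_informative=20, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Compare two unoptimized reducers; 10 components is an arbitrary choice.
scores = {
    "PCA": evaluate_reducer(
        PCA(n_components=10, random_state=0),
        X_train, X_test, y_train, y_test,
    ),
    "ICA": evaluate_reducer(
        FastICA(n_components=10, random_state=0, max_iter=1000),
        X_train, X_test, y_train, y_test,
    ),
}

for name, accuracy in scores.items():
    print(f"{name} + decision tree accuracy: {accuracy:.3f}")
```

Wrapping the reducer and the classifier in one pipeline means the projection is learned only from the training split, so the reported accuracy is not inflated by information from the test samples.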