On the stochastic gradient descent matrix factorization in application to the supervised classification of microarrays

Microarray datasets are high-dimensional, with a small number of collected samples compared to thousands of features. This poses a significant challenge that affects the interpretation, applicability and validation of the analytical results. Matrix factorizations have proven to be a useful me...


Bibliographic Details
Main Author: Vladimir Nikolaevich Nikulin
Format: Article
Language: Russian
Published: Institute of Computer Science 2013-04-01
Series: Компьютерные исследования и моделирование
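Subjects: matrix factorization; unsupervised learning; number of factors; nonnegativity; bioinformatics; leave-one-out; classification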
Online Access: http://crm.ics.org.ru/uploads/crmissues/crm_2013_2/13202.pdf
id doaj-6c9ff541c4ce4f4584f06bdee6a0ba34
record_format Article
spelling doaj-6c9ff541c4ce4f4584f06bdee6a0ba34 | 2020-11-24T21:30:42Z | rus | Institute of Computer Science | Компьютерные исследования и моделирование | 2076-7633 | 2077-6853 | 2013-04-01 | 5 | 2 | 131 | 140 | 10.20537/2076-7633-2013-5-2-131-140 | 2006 | On the stochastic gradient descent matrix factorization in application to the supervised classification of microarrays | Vladimir Nikolaevich Nikulin | Microarray datasets are high-dimensional, with a small number of collected samples compared to thousands of features. This poses a significant challenge that affects the interpretation, applicability and validation of the analytical results. Matrix factorizations have proven to be a useful method for describing data in terms of a small number of meta-features, which reduces noise while still capturing the essential features of the data. Three novel and mutually relevant methods are presented in this paper: 1) gradient-based matrix factorization with two adaptive learning rates (one per factor matrix) and their automatic updates; 2) a nonparametric criterion for selecting the number of factors; and 3) a nonnegative version of the gradient-based matrix factorization which, unlike existing methods, requires no extra computational cost. We demonstrate the effectiveness of the proposed methods on the supervised classification of gene expression data. | http://crm.ics.org.ru/uploads/crmissues/crm_2013_2/13202.pdf | matrix factorization | unsupervised learning | number of factors | nonnegativity | bioinformatics | leave-one-out | classification
collection DOAJ
language Russian
format Article
sources DOAJ
author Vladimir Nikolaevich Nikulin
title On the stochastic gradient descent matrix factorization in application to the supervised classification of microarrays
publisher Institute of Computer Science
series Компьютерные исследования и моделирование
issn 2076-7633
2077-6853
publishDate 2013-04-01
description Microarray datasets are high-dimensional, with a small number of collected samples compared to thousands of features. This poses a significant challenge that affects the interpretation, applicability and validation of the analytical results. Matrix factorizations have proven to be a useful method for describing data in terms of a small number of meta-features, which reduces noise while still capturing the essential features of the data. Three novel and mutually relevant methods are presented in this paper: 1) gradient-based matrix factorization with two adaptive learning rates (one per factor matrix) and their automatic updates; 2) a nonparametric criterion for selecting the number of factors; and 3) a nonnegative version of the gradient-based matrix factorization which, unlike existing methods, requires no extra computational cost. We demonstrate the effectiveness of the proposed methods on the supervised classification of gene expression data.
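The idea summarized in the description can be illustrated with a minimal sketch. It is not the paper's algorithm: the record does not give the update rules, so everything below is an assumption. The function name `factorize`, the full-batch (rather than stochastic) gradient steps, the accept/reject rule that grows a rate by 1.1 after an improving step and halves it otherwise, and the clipping to zero used for the nonnegative variant are all illustrative choices. The sketch approximates a features-by-samples matrix X by A @ B, with one learning rate per factor matrix.

```python
import numpy as np


def factorize(X, k, epochs=200, lr_a=0.01, lr_b=0.01, nonnegative=False, seed=0):
    """Approximate X (p x n) by A @ B, with A (p x k) and B (k x n).

    Hypothetical sketch: each factor matrix has its own learning rate,
    adapted with a simple accept/reject rule (an assumption, not the
    rule used in the paper).
    """
    rng = np.random.default_rng(seed)
    p, n = X.shape
    A = rng.uniform(0.0, 0.1, size=(p, k))
    B = rng.uniform(0.0, 0.1, size=(k, n))

    def loss(A_, B_):
        # Mean squared reconstruction error.
        return float(np.mean((X - A_ @ B_) ** 2))

    def project(M):
        # Assumed nonnegative variant: clip negative entries to zero.
        return np.maximum(M, 0.0) if nonnegative else M

    current = loss(A, B)
    history = [current]
    for _ in range(epochs):
        # Descent step on A with B fixed; constant factors of the
        # gradient are absorbed into the learning rate.
        E = X - A @ B
        A_try = project(A + lr_a * (E @ B.T))
        trial = loss(A_try, B)
        if trial < current:
            A, current, lr_a = A_try, trial, lr_a * 1.1  # accept, speed up
        else:
            lr_a *= 0.5  # reject, slow down

        # Descent step on B with A fixed, using its own rate.
        E = X - A @ B
        B_try = project(B + lr_b * (A.T @ E))
        trial = loss(A, B_try)
        if trial < current:
            B, current, lr_b = B_try, trial, lr_b * 1.1
        else:
            lr_b *= 0.5

        history.append(current)
    return A, B, history
```

Under these assumptions, the columns of B play the role of the meta-features: each sample is described by k numbers instead of thousands, and those k-dimensional vectors could then be passed to any classifier and evaluated with leave-one-out, as the keywords suggest.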
topic matrix factorization
unsupervised learning
number of factors
nonnegativity
bioinformatics
leave-one-out
classification
url http://crm.ics.org.ru/uploads/crmissues/crm_2013_2/13202.pdf