On the stochastic gradient descent matrix factorization in application to the supervised classification of microarrays
Microarray datasets are high-dimensional, with few collected samples relative to thousands of features. This poses a significant challenge for the interpretation, applicability, and validation of analytical results. Matrix factorizations have proven to be a useful me...
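Main Author: | Vladimir Nikolaevich Nikulin |
---|---|
Format: | Article |
Language: | Russian |
Published: | Institute of Computer Science, 2013-04-01 |
Series: | Компьютерные исследования и моделирование (Computer Research and Modeling) |
Subjects: | matrix factorization; unsupervised learning; number of factors; nonnegativity; bioinformatics; leave-one-out; classification |
Online Access: | http://crm.ics.org.ru/uploads/crmissues/crm_2013_2/13202.pdf |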
id |
doaj-6c9ff541c4ce4f4584f06bdee6a0ba34 |
---|---|
record_format |
Article |
doi |
10.20537/2076-7633-2013-5-2-131-140 |
collection |
DOAJ |
author |
Vladimir Nikolaevich Nikulin |
issn |
2076-7633 2077-6853 |
description |
Microarray datasets are high-dimensional, with few collected samples relative to thousands of features. This poses a significant challenge for the interpretation, applicability, and validation of analytical results. Matrix factorization has proven useful for describing data in terms of a small number of meta-features, which reduces noise while still capturing the essential features of the data. Three novel and mutually related methods are presented in this paper: 1) gradient-based matrix factorization with two adaptive learning rates (one per factor matrix) that are updated automatically; 2) a nonparametric criterion for selecting the number of factors; and 3) a nonnegative version of the gradient-based matrix factorization which, unlike existing methods, requires no extra computational cost. We demonstrate the effectiveness of the proposed methods in the supervised classification of gene expression data. |
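
The abstract names gradient-based matrix factorization with one learning rate per factor matrix but does not give the update rules. The sketch below is only an illustration of that general idea, not the paper's algorithm: it approximates X by the product A B using element-wise stochastic gradient steps, and it adapts each rate with a simple accept-or-undo heuristic (grow the rate after a pass that lowers the error, halve it and undo the pass otherwise). The function names, the growth and shrink factors, and the clipping used for the nonnegative variant are all assumptions made for this example.

```python
import random


def squared_error(X, A, B):
    """Return the squared Frobenius error ||X - A B||^2."""
    k = len(B)
    total = 0.0
    for i in range(len(X)):
        for j in range(len(X[0])):
            d = X[i][j] - sum(A[i][f] * B[f][j] for f in range(k))
            total += d * d
    return total


def sgd_pass(X, A, B, rate, side, nonneg, rng):
    """One shuffled element-wise pass that updates only A or only B."""
    k = len(B)
    cells = [(i, j) for i in range(len(X)) for j in range(len(X[0]))]
    rng.shuffle(cells)
    for i, j in cells:
        err = X[i][j] - sum(A[i][f] * B[f][j] for f in range(k))
        for f in range(k):
            if side == "A":
                A[i][f] += rate * err * B[f][j]
                if nonneg:
                    A[i][f] = max(A[i][f], 0.0)  # clip to zero (assumption)
            else:
                B[f][j] += rate * err * A[i][f]
                if nonneg:
                    B[f][j] = max(B[f][j], 0.0)  # clip to zero (assumption)


def factorize(X, k, epochs=100, rate_a=0.01, rate_b=0.01, nonneg=False, seed=0):
    """Factorize X (n x m) into A (n x k) and B (k x m); return A, B, errors."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    A = [[rng.random() for _ in range(k)] for _ in range(n)]
    B = [[rng.random() for _ in range(m)] for _ in range(k)]
    rates = {"A": rate_a, "B": rate_b}  # one learning rate per factor matrix
    history = [squared_error(X, A, B)]
    for _ in range(epochs):
        for side, M in (("A", A), ("B", B)):
            backup = [row[:] for row in M]
            sgd_pass(X, A, B, rates[side], side, nonneg, rng)
            err = squared_error(X, A, B)
            if err <= history[-1]:
                rates[side] *= 1.1  # the pass helped: grow this rate
                history.append(err)
            else:
                M[:] = backup       # the pass hurt: undo it
                rates[side] *= 0.5  # and shrink this rate
    return A, B, history


if __name__ == "__main__":
    # Hypothetical toy data: a small rank-1 nonnegative matrix.
    X = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0], [4.0, 8.0, 12.0]]
    A, B, history = factorize(X, k=2, nonneg=True)
    print("initial error:", history[0], "final error:", history[-1])
```

Because a pass is kept only when it does not increase the error, the recorded error sequence is non-increasing by construction, and the two rates are free to settle at different values for A and B.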
topic |
matrix factorization; unsupervised learning; number of factors; nonnegativity; bioinformatics; leave-one-out; classification |