Gene Tree Labeling Using Nonnegative Matrix Factorization on Biomedical Literature

Identifying functional groups of genes is a challenging problem for biological applications. Text mining approaches can be used to build hierarchical clusters or trees from the information in the biological literature. In particular, the nonnegative matrix factorization (NMF) is examined as one appr...

Bibliographic Details
Main Authors: Kevin E. Heinrich, Michael W. Berry, Ramin Homayouni
Format: Article
Language: English
Published: Hindawi Limited 2008-01-01
Series: Computational Intelligence and Neuroscience
Online Access: http://dx.doi.org/10.1155/2008/276535
id doaj-3ce9a1cda34941f6adf641013e23ea93
record_format Article
spelling doaj-3ce9a1cda34941f6adf641013e23ea93; 2020-11-24T22:22:28Z; eng; Hindawi Limited; Computational Intelligence and Neuroscience; 1687-5265; 1687-5273; 2008-01-01; 2008; 10.1155/2008/276535; 276535; Gene Tree Labeling Using Nonnegative Matrix Factorization on Biomedical Literature; Kevin E. Heinrich (0); Michael W. Berry (1); Ramin Homayouni (2); (0) Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN 37996-3450, USA; (1) Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN 37996-3450, USA; (2) Department of Biology, University of Memphis, Memphis, TN 38152-3150, USA; Identifying functional groups of genes is a challenging problem for biological applications. Text mining approaches can be used to build hierarchical clusters or trees from the information in the biological literature. In particular, the nonnegative matrix factorization (NMF) is examined as one approach to label hierarchical trees. A generic labeling algorithm as well as an evaluation technique is proposed, and the effects of different NMF parameters with regard to convergence and labeling accuracy are discussed. The primary goals of this study are to provide a qualitative assessment of the NMF and its various parameters and initialization, to provide an automated way to classify biomedical data, and to provide a method for evaluating labeled data assuming a static input tree. As a byproduct, a method for generating gold standard trees is proposed.; http://dx.doi.org/10.1155/2008/276535
collection DOAJ
language English
format Article
sources DOAJ
author Kevin E. Heinrich
Michael W. Berry
Ramin Homayouni
title Gene Tree Labeling Using Nonnegative Matrix Factorization on Biomedical Literature
publisher Hindawi Limited
series Computational Intelligence and Neuroscience
issn 1687-5265
1687-5273
publishDate 2008-01-01
description Identifying functional groups of genes is a challenging problem for biological applications. Text mining approaches can be used to build hierarchical clusters or trees from the information in the biological literature. In particular, the nonnegative matrix factorization (NMF) is examined as one approach to label hierarchical trees. A generic labeling algorithm as well as an evaluation technique is proposed, and the effects of different NMF parameters with regard to convergence and labeling accuracy are discussed. The primary goals of this study are to provide a qualitative assessment of the NMF and its various parameters and initialization, to provide an automated way to classify biomedical data, and to provide a method for evaluating labeled data assuming a static input tree. As a byproduct, a method for generating gold standard trees is proposed.
url http://dx.doi.org/10.1155/2008/276535