Partitioning of functional gene expression data using principal points
Abstract Background: DNA microarrays offer motivation and hope for the simultaneous study of variations in multiple genes. Gene expression is a temporal process that allows variations in expression levels with a characterized gene function over a period of time. Temporal gene expression curves can be...
Main Authors: | Jaehee Kim, Haseong Kim |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2017-10-01 |
Series: | BMC Bioinformatics |
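Subjects: | Fourier coefficients; Legendre polynomials; Escherichia coli Microarray expression data; K-means clustering; Principal points; Silhouette |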
Online Access: | http://link.springer.com/article/10.1186/s12859-017-1860-0 |
id |
doaj-90cf00690cc14d8e9018f90eee2c7007 |
---|---|
record_format |
Article |
spelling |
doaj-90cf00690cc14d8e9018f90eee2c7007; 2020-11-24T21:17:08Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2017-10-01; 18; 1; 1; 17; 10.1186/s12859-017-1860-0; Partitioning of functional gene expression data using principal points; Jaehee Kim (0); Haseong Kim (1); (0) Department of Statistics, Duksung Women’s University; (1) Korea Research Institute of Bioscience and Biotechnology (KRIBB); Abstract Background: DNA microarrays offer motivation and hope for the simultaneous study of variations in multiple genes. Gene expression is a temporal process that allows variations in expression levels with a characterized gene function over a period of time. Temporal gene expression curves can be treated as functional data since they are considered as independent realizations of a stochastic process. This process requires appropriate models to identify patterns of gene functions. The partitioning of the functional data can find homogeneous subgroups of entities for the massive genes within the inherent biological networks. Therefore, it can be a useful technique for the analysis of time-course gene expression data. We propose a new self-consistent partitioning method of functional coefficients for individual expression profiles based on the orthonormal basis system. Results: A principal-points-based functional partitioning method is proposed for time-course gene expression data. The method explores the relationship between genes using Legendre coefficients as principal points to extract the features of gene functions. Our proposed method provides high connectivity after clustering for simulated data and finds significant subsets of genes with increased connectivity. Our approach has the comparative advantages that fewer coefficients are used from the functional data and that the principal points used for partitioning are self-consistent. As real data applications, we are able to find partitioned genes through the gene expressions found in budding yeast data and Escherichia coli data. Conclusions: The proposed method benefits from the use of principal points, dimension reduction, and the choice of orthogonal basis system, and provides appropriately connected genes in the resulting subsets. We illustrate our method by applying it to sets of cell-cycle-regulated time-course yeast genes and E. coli genes. The proposed method is able to identify highly connected genes and to explore the complex dynamics of biological systems in functional genomics.; http://link.springer.com/article/10.1186/s12859-017-1860-0; Fourier coefficients; Legendre polynomials; Escherichia coli Microarray expression data; K-means clustering; Principal points; Silhouette |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Jaehee Kim Haseong Kim |
spellingShingle |
Jaehee Kim Haseong Kim Partitioning of functional gene expression data using principal points BMC Bioinformatics Fourier coefficients Legendre polynomials Escherichia coli Microarray expression data K-means clustering Principal points Silhouette |
author_facet |
Jaehee Kim Haseong Kim |
author_sort |
Jaehee Kim |
title |
Partitioning of functional gene expression data using principal points |
title_short |
Partitioning of functional gene expression data using principal points |
title_full |
Partitioning of functional gene expression data using principal points |
title_fullStr |
Partitioning of functional gene expression data using principal points |
title_full_unstemmed |
Partitioning of functional gene expression data using principal points |
title_sort |
partitioning of functional gene expression data using principal points |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2017-10-01 |
description |
Abstract Background: DNA microarrays offer motivation and hope for the simultaneous study of variations in multiple genes. Gene expression is a temporal process that allows variations in expression levels with a characterized gene function over a period of time. Temporal gene expression curves can be treated as functional data since they are considered as independent realizations of a stochastic process. This process requires appropriate models to identify patterns of gene functions. The partitioning of the functional data can find homogeneous subgroups of entities for the massive genes within the inherent biological networks. Therefore, it can be a useful technique for the analysis of time-course gene expression data. We propose a new self-consistent partitioning method of functional coefficients for individual expression profiles based on the orthonormal basis system. Results: A principal-points-based functional partitioning method is proposed for time-course gene expression data. The method explores the relationship between genes using Legendre coefficients as principal points to extract the features of gene functions. Our proposed method provides high connectivity after clustering for simulated data and finds significant subsets of genes with increased connectivity. Our approach has the comparative advantages that fewer coefficients are used from the functional data and that the principal points used for partitioning are self-consistent. As real data applications, we are able to find partitioned genes through the gene expressions found in budding yeast data and Escherichia coli data. Conclusions: The proposed method benefits from the use of principal points, dimension reduction, and the choice of orthogonal basis system, and provides appropriately connected genes in the resulting subsets. We illustrate our method by applying it to sets of cell-cycle-regulated time-course yeast genes and E. coli genes. The proposed method is able to identify highly connected genes and to explore the complex dynamics of biological systems in functional genomics. |
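The idea summarized in the abstract (represent each expression curve by a few Legendre coefficients, then partition the coefficient vectors around self-consistent centers) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the degree-3 basis, the trapezoid-rule projection, the synthetic curves, and the use of plain Lloyd-style k-means as the estimator of principal points are all choices made here for the example. The record does not say how the paper estimates the coefficients or chooses the number of groups.

```python
import math
import random


def legendre_values(x, degree):
    """Return [P_0(x), ..., P_degree(x)] using Bonnet's recurrence."""
    vals = [1.0, x]
    for n in range(1, degree):
        vals.append(((2 * n + 1) * x * vals[n] - n * vals[n - 1]) / (n + 1))
    return vals[: degree + 1]


def legendre_coefficients(times, values, degree):
    """Approximate orthonormal Legendre coefficients of one sampled curve.

    Times are rescaled to [-1, 1]; each coefficient is a trapezoid-rule
    estimate of the inner product with sqrt((2n + 1) / 2) * P_n.
    """
    lo, hi = times[0], times[-1]
    xs = [2 * (t - lo) / (hi - lo) - 1 for t in times]
    coefs = []
    for n in range(degree + 1):
        scale = math.sqrt((2 * n + 1) / 2)
        g = [v * scale * legendre_values(x, degree)[n] for x, v in zip(xs, values)]
        coefs.append(sum((xs[i + 1] - xs[i]) * (g[i] + g[i + 1]) / 2
                         for i in range(len(xs) - 1)))
    return coefs


def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def kmeans(points, k, iters=50, seed=0):
    """Lloyd-style k-means; at convergence every center is the mean of its
    own cluster, which is the self-consistency property of principal points."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


# Synthetic stand-in for time-course data (hypothetical, not from the paper):
# ten noisy periodic profiles followed by ten noisy linear trends.
rng = random.Random(1)
times = [i / 19 for i in range(20)]
curves = [[math.sin(2 * math.pi * t) + rng.gauss(0, 0.1) for t in times]
          for _ in range(10)]
curves += [[2 * t - 1 + rng.gauss(0, 0.1) for t in times] for _ in range(10)]

# Dimension reduction: 20 time points per curve become 4 coefficients.
coefs = [legendre_coefficients(times, c, degree=3) for c in curves]
centers, labels = kmeans(coefs, k=2)
print(labels)
```

In practice one would more likely fit the coefficients by least squares with a numerical library and pick the number of groups with a criterion such as the silhouette width listed among the keywords; the pure-Python version above only keeps the example self-contained.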
topic |
Fourier coefficients Legendre polynomials Escherichia coli Microarray expression data K-means clustering Principal points Silhouette |
url |
http://link.springer.com/article/10.1186/s12859-017-1860-0 |
work_keys_str_mv |
AT jaeheekim partitioningoffunctionalgeneexpressiondatausingprincipalpoints AT haseongkim partitioningoffunctionalgeneexpressiondatausingprincipalpoints |
_version_ |
1726014007267033088 |