Partitioning of functional gene expression data using principal points

Abstract. Background: DNA microarrays offer motivation and hope for the simultaneous study of variations in multiple genes. Gene expression is a temporal process in which the expression levels of genes with characterized functions vary over a period of time. Temporal gene expression curves can be...

Bibliographic Details
Main Authors:	Jaehee Kim, Haseong Kim
Format:	Article
Language:	English
Published:	BMC 2017-10-01
Series:	BMC Bioinformatics
Subjects:	Fourier coefficients; Legendre polynomials; Escherichia coli Microarray expression data; K-means clustering; Principal points; Silhouette
Online Access:	http://link.springer.com/article/10.1186/s12859-017-1860-0

id	doaj-90cf00690cc14d8e9018f90eee2c7007
record_format	Article
spelling	doaj-90cf00690cc14d8e9018f90eee2c7007; 2020-11-24T21:17:08Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2017-10-01; 18; 1; 1; 17; 10.1186/s12859-017-1860-0; Partitioning of functional gene expression data using principal points; Jaehee Kim 0; Haseong Kim 1; Department of Statistics, Duksung Women’s University; Korea Research Institute of Bioscience and Biotechnology (KRIBB); Abstract. Background: DNA microarrays offer motivation and hope for the simultaneous study of variations in multiple genes. Gene expression is a temporal process in which the expression levels of genes with characterized functions vary over a period of time. Temporal gene expression curves can be treated as functional data since they are considered independent realizations of a stochastic process. This process requires appropriate models to identify patterns of gene functions. Partitioning the functional data can find homogeneous subgroups among the large number of genes within the inherent biological networks. Therefore, it can be a useful technique for the analysis of time-course gene expression data. We propose a new self-consistent partitioning method of functional coefficients for individual expression profiles based on an orthonormal basis system. Results: A principal-points-based functional partitioning method is proposed for time-course gene expression data. The method explores the relationship between genes using Legendre coefficients as principal points to extract the features of gene functions. Our proposed method provides high connectivity after clustering for simulated data and finds significant subsets of genes with increased connectivity. Our approach has the comparative advantages that fewer coefficients are used from the functional data and that the principal points used for partitioning are self-consistent. As real data applications, we are able to find partitioned genes through the gene expressions found in budding yeast data and Escherichia coli data. Conclusions: The proposed method benefits from the use of principal points, dimension reduction, and the choice of an orthogonal basis system, and provides appropriately connected genes in the resulting subsets. We illustrate our method by applying it to sets of cell-cycle-regulated time-course yeast genes and E. coli genes. The proposed method is able to identify highly connected genes and to explore the complex dynamics of biological systems in functional genomics.; http://link.springer.com/article/10.1186/s12859-017-1860-0; Fourier coefficients; Legendre polynomials; Escherichia coli Microarray expression data; K-means clustering; Principal points; Silhouette
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Jaehee Kim Haseong Kim
spellingShingle	Jaehee Kim Haseong Kim Partitioning of functional gene expression data using principal points BMC Bioinformatics Fourier coefficients Legendre polynomials Escherichia coli Microarray expression data K-means clustering Principal points Silhouette
author_facet	Jaehee Kim Haseong Kim
author_sort	Jaehee Kim
title	Partitioning of functional gene expression data using principal points
title_short	Partitioning of functional gene expression data using principal points
title_full	Partitioning of functional gene expression data using principal points
title_fullStr	Partitioning of functional gene expression data using principal points
title_full_unstemmed	Partitioning of functional gene expression data using principal points
title_sort	partitioning of functional gene expression data using principal points
publisher	BMC
series	BMC Bioinformatics
issn	1471-2105
publishDate	2017-10-01
description	Abstract. Background: DNA microarrays offer motivation and hope for the simultaneous study of variations in multiple genes. Gene expression is a temporal process in which the expression levels of genes with characterized functions vary over a period of time. Temporal gene expression curves can be treated as functional data since they are considered independent realizations of a stochastic process. This process requires appropriate models to identify patterns of gene functions. Partitioning the functional data can find homogeneous subgroups among the large number of genes within the inherent biological networks. Therefore, it can be a useful technique for the analysis of time-course gene expression data. We propose a new self-consistent partitioning method of functional coefficients for individual expression profiles based on an orthonormal basis system. Results: A principal-points-based functional partitioning method is proposed for time-course gene expression data. The method explores the relationship between genes using Legendre coefficients as principal points to extract the features of gene functions. Our proposed method provides high connectivity after clustering for simulated data and finds significant subsets of genes with increased connectivity. Our approach has the comparative advantages that fewer coefficients are used from the functional data and that the principal points used for partitioning are self-consistent. As real data applications, we are able to find partitioned genes through the gene expressions found in budding yeast data and Escherichia coli data. Conclusions: The proposed method benefits from the use of principal points, dimension reduction, and the choice of an orthogonal basis system, and provides appropriately connected genes in the resulting subsets. We illustrate our method by applying it to sets of cell-cycle-regulated time-course yeast genes and E. coli genes. The proposed method is able to identify highly connected genes and to explore the complex dynamics of biological systems in functional genomics.
topic	Fourier coefficients; Legendre polynomials; Escherichia coli Microarray expression data; K-means clustering; Principal points; Silhouette
url	http://link.springer.com/article/10.1186/s12859-017-1860-0
work_keys_str_mv	AT jaeheekim partitioningoffunctionalgeneexpressiondatausingprincipalpoints AT haseongkim partitioningoffunctionalgeneexpressiondatausingprincipalpoints
_version_	1726014007267033088

Similar Items