Evaluation of gene association methods for coexpression network construction and biological knowledge discovery.

BACKGROUND: Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly eval...

Bibliographic Details
Main Authors: Sapna Kumari, Jeff Nie, Huann-Sheng Chen, Hao Ma, Ron Stewart, Xiang Li, Meng-Zhu Lu, William M Taylor, Hairong Wei
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2012-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3511551?pdf=render
id doaj-0b383d0de9014bdea1be451a93ef24fd
record_format Article
spelling doaj-0b383d0de9014bdea1be451a93ef24fd; 2020-11-25T02:09:26Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2012-01-01; vol. 7, no. 11, e50411; DOI 10.1371/journal.pone.0050411
collection DOAJ
language English
format Article
sources DOAJ
author Sapna Kumari
Jeff Nie
Huann-Sheng Chen
Hao Ma
Ron Stewart
Xiang Li
Meng-Zhu Lu
William M Taylor
Hairong Wei
author_sort Sapna Kumari
title Evaluation of gene association methods for coexpression network construction and biological knowledge discovery.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2012-01-01
description BACKGROUND: Constructing coexpression networks and performing network analysis on large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the gene association methods used to construct these networks have not been thoroughly evaluated. Because different methods produce structurally different coexpression networks that convey different information, selecting the optimal gene association method is critical. METHODS AND RESULTS: In this study, we compared eight gene association methods (Spearman rank correlation, Weighted Rank Correlation, Kendall, Hoeffding's D measure, Theil-Sen, Rank Theil-Sen, Distance Covariance, and Pearson) and focused on their true knowledge discovery rates in associating pathway genes and in constructing coordination networks of regulatory genes. We also examined how the methods behave on microarray data with different properties, and whether the biological process under study affects their efficiency. CONCLUSIONS: We found that the Spearman, Hoeffding, and Kendall methods are effective in identifying coexpressed pathway genes, whereas the Theil-Sen, Rank Theil-Sen, Spearman, and Weighted Rank methods perform well in identifying coordinated transcription factors that control the same biological processes and traits. Surprisingly, the widely used Pearson method is generally less efficient, as is the Distance Covariance method, which can find gene pairs with multiple types of relationships. Some of our analyses clearly show that the Pearson and Distance Covariance methods behave distinctly from the other six methods. The efficiencies of the methods vary to some degree with the data properties and depend largely on the biological processes, which necessitates a pre-analysis to identify the best-performing method for gene association and coexpression network construction.
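To illustrate why the choice of association measure matters, the following is a minimal sketch, not the authors' implementation, of three of the eight measures named in the description (Pearson, Spearman, and Kendall) written in plain Python. The expression profiles `gene_a` and `gene_b` are hypothetical values invented for illustration, and the Kendall function computes the simple tau-a variant without tie correction, which is an assumption made for brevity.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical expression profiles across six samples (illustrative only).
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_b = [v ** 2 for v in gene_a]  # monotone but nonlinear in gene_a

scores = {
    "pearson": pearson(gene_a, gene_b),
    "spearman": spearman(gene_a, gene_b),
    "kendall": kendall_tau_a(gene_a, gene_b),
}
```

For this monotone but nonlinear pair, the two rank-based measures report a perfect association, while Pearson reports a value below 1. This only shows that the measures can disagree on the same data; it says nothing about which one performs better on real expression data.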
url http://europepmc.org/articles/PMC3511551?pdf=render
_version_ 1724923806354505728