Evaluation of gene association methods for coexpression network construction and biological knowledge discovery.
BACKGROUND: Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly eval...
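Main Authors: | Sapna Kumari, Jeff Nie, Huann-Sheng Chen, Hao Ma, Ron Stewart, Xiang Li, Meng-Zhu Lu, William M Taylor, Hairong Wei |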
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2012-01-01 |
Series: | PLoS ONE |
Online Access: | http://europepmc.org/articles/PMC3511551?pdf=render |
id |
doaj-0b383d0de9014bdea1be451a93ef24fd |
---|---|
record_format |
Article |
spelling |
doaj-0b383d0de9014bdea1be451a93ef24fd; 2020-11-25T02:09:26Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2012-01-01; 7(11): e50411; 10.1371/journal.pone.0050411; Evaluation of gene association methods for coexpression network construction and biological knowledge discovery.; Sapna Kumari, Jeff Nie, Huann-Sheng Chen, Hao Ma, Ron Stewart, Xiang Li, Meng-Zhu Lu, William M Taylor, Hairong Wei; http://europepmc.org/articles/PMC3511551?pdf=render |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Sapna Kumari, Jeff Nie, Huann-Sheng Chen, Hao Ma, Ron Stewart, Xiang Li, Meng-Zhu Lu, William M Taylor, Hairong Wei |
author_sort |
Sapna Kumari |
title |
Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS ONE |
issn |
1932-6203 |
publishDate |
2012-01-01 |
description |
BACKGROUND: Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly evaluated. Since different methods lead to structurally different coexpression networks and provide different information, selecting the optimal gene association method is critical. METHODS AND RESULTS: In this study, we compared eight gene association methods (Spearman rank correlation, Weighted Rank Correlation, Kendall, Hoeffding's D measure, Theil-Sen, Rank Theil-Sen, Distance Covariance, and Pearson) and focused on their true knowledge discovery rates in associating pathway genes and constructing coordination networks of regulatory genes. We also examined how the methods behave on microarray data with different properties, and whether the biological process under study affects their efficiency. CONCLUSIONS: We found that the Spearman, Hoeffding, and Kendall methods are effective in identifying coexpressed pathway genes, whereas the Theil-Sen, Rank Theil-Sen, Spearman, and Weighted Rank methods perform well in identifying coordinated transcription factors that control the same biological processes and traits. Surprisingly, the widely used Pearson method is generally less efficient, as is the Distance Covariance method, which can find gene pairs with multiple types of relationships. Some of our analyses clearly show that the Pearson and Distance Covariance methods behave differently from the other six methods. The efficiencies of the methods vary to some degree with the data properties and are largely contingent upon the biological processes, which necessitates a pre-analysis to identify the best-performing method for gene association and coexpression network construction. |
url |
http://europepmc.org/articles/PMC3511551?pdf=render |
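The abstract above contrasts rank-based association measures (Spearman, Kendall) with the Pearson correlation. The following is a minimal illustrative sketch of that contrast, not the authors' implementation: it covers only three of the eight methods, uses Kendall's tau-a (the record does not say which Kendall variant was used), and the gene names, expression values, threshold, and the helper `coexpression_edges` are all hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    s = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)


def coexpression_edges(profiles, measure, threshold):
    """Hypothetical helper: keep gene pairs whose |association| >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        if abs(measure(profiles[g1], profiles[g2])) >= threshold:
            edges.append((g1, g2))
    return edges


# Made-up expression profiles across six samples (illustration only).
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_b = [math.exp(v) for v in gene_a]   # monotonic but nonlinear in gene_a
gene_c = [4.0, 1.0, 6.0, 2.0, 5.0, 3.0]  # no clear relation to gene_a
profiles = {"geneA": gene_a, "geneB": gene_b, "geneC": gene_c}

for name, measure in [("pearson", pearson), ("spearman", spearman),
                      ("kendall", kendall_tau_a)]:
    print(name, coexpression_edges(profiles, measure, threshold=0.9))
```

In this toy data the rank-based measures link geneA and geneB because their relationship is perfectly monotonic, while Pearson, which measures linear association, falls below the (arbitrary) 0.9 cutoff. This only illustrates why the choice of measure changes the resulting network; it says nothing about which method performs best on real data.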