High-Dimensional Bayesian Network Inference From Systems Genetics Data Using Genetic Node Ordering
Studying the impact of genetic variation on gene regulatory networks is essential to understand the biological mechanisms by which genetic variation causes variation in phenotypes. Bayesian networks provide an elegant statistical approach for multi-trait genetic mapping and modelling causal trait re...
Main Authors: | Lingfei Wang, Pieter Audenaert, Tom Michoel |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2019-12-01 |
Series: | Frontiers in Genetics |
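Subjects: | systems genetics, network inference, Bayesian network, expression quantitative trait loci analysis, gene expression |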
Online Access: | https://www.frontiersin.org/article/10.3389/fgene.2019.01196/full |
id |
doaj-f753f66e88054f3ebfedf35636a8c423 |
---|---|
record_format |
Article |
spelling |
doaj-f753f66e88054f3ebfedf35636a8c423; 2020-11-25T01:08:21Z; eng; Frontiers Media S.A.; Frontiers in Genetics; 1664-8021; 2019-12-01; 10; 10.3389/fgene.2019.01196; 482860; High-Dimensional Bayesian Network Inference From Systems Genetics Data Using Genetic Node Ordering; Lingfei Wang (0, 1, 2), Pieter Audenaert (3, 4), Tom Michoel (5, 6); (0) Division of Genetics and Genomics, The Roslin Institute, The University of Edinburgh, Easter Bush Campus, Midlothian, United Kingdom; (1) Broad Institute of Harvard and MIT, Cambridge, MA, United States; (2) Department of Molecular Biology, Massachusetts General Hospital, Boston, MA, United States; (3) IDLab, Ghent University – imec, Ghent, Belgium; (4) Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium; (5) Division of Genetics and Genomics, The Roslin Institute, The University of Edinburgh, Easter Bush Campus, Midlothian, United Kingdom; (6) Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway; Studying the impact of genetic variation on gene regulatory networks is essential for understanding the biological mechanisms by which genetic variation causes variation in phenotypes. Bayesian networks provide an elegant statistical approach for multi-trait genetic mapping and modelling causal trait relationships. However, inferring Bayesian gene networks from high-dimensional genetics and genomics data is challenging, because the number of possible networks scales super-exponentially with the number of nodes, and the computational cost of conventional Bayesian network inference methods quickly becomes prohibitive. We propose an alternative method to infer high-quality Bayesian gene networks that easily scales to thousands of genes. Our method first reconstructs a node ordering by conducting pairwise causal inference tests between genes, which then allows a Bayesian network to be inferred via a series of independent variable selection problems, one for each gene. Using simulated and real systems genetics data, we demonstrate that this results in a Bayesian network with likelihood equal to, and sometimes better than, that of the conventional methods, while having a significantly higher overlap with ground-truth networks and being orders of magnitude faster. Moreover, our method allows for a unified false discovery rate control across genes and individual edges, and thus a rigorous and easily interpretable way to tune the sparsity level of the inferred network. Bayesian network inference using pairwise node ordering is a highly efficient approach for reconstructing gene regulatory networks when prior information for the inclusion of edges exists or can be inferred from the available data.; https://www.frontiersin.org/article/10.3389/fgene.2019.01196/full; systems genetics, network inference, Bayesian network, expression quantitative trait loci analysis, gene expression |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Lingfei Wang, Pieter Audenaert, Tom Michoel |
spellingShingle |
Lingfei Wang, Pieter Audenaert, Tom Michoel; High-Dimensional Bayesian Network Inference From Systems Genetics Data Using Genetic Node Ordering; Frontiers in Genetics; systems genetics, network inference, Bayesian network, expression quantitative trait loci analysis, gene expression |
author_facet |
Lingfei Wang, Pieter Audenaert, Tom Michoel |
author_sort |
Lingfei Wang |
title |
High-Dimensional Bayesian Network Inference From Systems Genetics Data Using Genetic Node Ordering |
title_sort |
high-dimensional bayesian network inference from systems genetics data using genetic node ordering |
publisher |
Frontiers Media S.A. |
series |
Frontiers in Genetics |
issn |
1664-8021 |
publishDate |
2019-12-01 |
description |
Studying the impact of genetic variation on gene regulatory networks is essential for understanding the biological mechanisms by which genetic variation causes variation in phenotypes. Bayesian networks provide an elegant statistical approach for multi-trait genetic mapping and modelling causal trait relationships. However, inferring Bayesian gene networks from high-dimensional genetics and genomics data is challenging, because the number of possible networks scales super-exponentially with the number of nodes, and the computational cost of conventional Bayesian network inference methods quickly becomes prohibitive. We propose an alternative method to infer high-quality Bayesian gene networks that easily scales to thousands of genes. Our method first reconstructs a node ordering by conducting pairwise causal inference tests between genes, which then allows a Bayesian network to be inferred via a series of independent variable selection problems, one for each gene. Using simulated and real systems genetics data, we demonstrate that this results in a Bayesian network with likelihood equal to, and sometimes better than, that of the conventional methods, while having a significantly higher overlap with ground-truth networks and being orders of magnitude faster. Moreover, our method allows for a unified false discovery rate control across genes and individual edges, and thus a rigorous and easily interpretable way to tune the sparsity level of the inferred network. Bayesian network inference using pairwise node ordering is a highly efficient approach for reconstructing gene regulatory networks when prior information for the inclusion of edges exists or can be inferred from the available data. |
topic |
systems genetics, network inference, Bayesian network, expression quantitative trait loci analysis, gene expression |
url |
https://www.frontiersin.org/article/10.3389/fgene.2019.01196/full |
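
The description field outlines a two-stage procedure: derive a node ordering from pairwise causal tests, then solve one independent parent-selection problem per gene. The following is a minimal sketch of that idea under stated assumptions, not the authors' implementation. The `scores` dictionary, the greedy cycle-avoiding ordering, and the simple threshold used in place of a real variable selection step are all illustrative choices that this record does not confirm.

```python
from typing import Dict, List, Tuple

Scores = Dict[Tuple[int, int], float]


def greedy_node_order(scores: Scores, n: int) -> List[int]:
    """Order n nodes from hypothetical pairwise scores.

    scores[(i, j)] is an assumed score for "gene i is causal for gene j".
    Pairs are visited from highest to lowest score, and a pair is accepted
    only if it does not close a directed cycle with pairs accepted earlier.
    """
    children = {i: set() for i in range(n)}

    def reaches(src: int, dst: int) -> bool:
        # Depth-first search over the pairs accepted so far.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(children[node])
        return False

    for (i, j), _ in sorted(scores.items(), key=lambda kv: -kv[1]):
        # Accepting i -> j is safe unless j can already reach i.
        if i != j and not reaches(j, i):
            children[i].add(j)

    # Kahn's algorithm: topological sort of the accepted pairs.
    indegree = {i: 0 for i in range(n)}
    for i in children:
        for j in children[i]:
            indegree[j] += 1
    ready = sorted(i for i in range(n) if indegree[i] == 0)
    order: List[int] = []
    while ready:
        node = ready.pop(0)
        order.append(node)
        for child in sorted(children[node]):
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    return order


def select_parents(order: List[int], scores: Scores,
                   threshold: float) -> Dict[int, List[int]]:
    """Solve one independent selection problem per node.

    Only nodes earlier in the ordering are candidate parents, so the result
    is acyclic by construction. The threshold is a stand-in for a proper
    variable selection method.
    """
    parents: Dict[int, List[int]] = {}
    for position, node in enumerate(order):
        candidates = order[:position]
        parents[node] = [p for p in candidates
                         if scores.get((p, node), 0.0) >= threshold]
    return parents


# Toy example with three genes and made-up scores.
toy_scores = {(0, 1): 0.9, (1, 2): 0.8, (2, 0): 0.7,
              (0, 2): 0.3, (1, 0): 0.2, (2, 1): 0.1}
toy_order = greedy_node_order(toy_scores, 3)
toy_parents = select_parents(toy_order, toy_scores, threshold=0.5)
```

Because each node chooses parents only among its predecessors, the per-node problems do not interact and could be run in parallel, which is consistent with the scalability claim in the description.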