Learning Subject-Specific Directed Acyclic Graphs With Mixed Effects Structural Equation Models From Observational Data
The identification of causal relationships between random variables from large-scale observational data using directed acyclic graphs (DAGs) is highly challenging. We propose a new mixed-effects structural equation model (mSEM) framework to estimate subject-specific DAGs, where we represent the joint dis...
Main Authors: | Xiang Li, Shanghong Xie, Peter McColgan, Sarah J. Tabrizi, Rachael I. Scahill, Donglin Zeng, Yuanjia Wang |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2018-10-01 |
Series: | Frontiers in Genetics |
Subjects: | graphical models; network analysis; causal structure discovery; heterogeneity; regularization |
Online Access: | https://www.frontiersin.org/article/10.3389/fgene.2018.00430/full |
id |
doaj-6211291622cb4af79ce5cf4024e4167e |
---|---|
record_format |
Article |
spelling |
doaj-6211291622cb4af79ce5cf4024e4167e; 2020-11-25T02:26:02Z; eng; Frontiers Media S.A.; Frontiers in Genetics; 1664-8021; 2018-10-01; 9; 10.3389/fgene.2018.00430; 410326
Learning Subject-Specific Directed Acyclic Graphs With Mixed Effects Structural Equation Models From Observational Data
Authors: Xiang Li (0); Shanghong Xie (1); Peter McColgan (2); Sarah J. Tabrizi (3); Rachael I. Scahill (4); Donglin Zeng (5); Yuanjia Wang (6); Yuanjia Wang (7)
Affiliations: (0) Statistics and Decision Sciences, Janssen Research and Development, LLC, Raritan, NJ, United States; (1) Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY, United States; (2) National Hospital for Neurology and Neurosurgery, London, United Kingdom; (3) National Hospital for Neurology and Neurosurgery, London, United Kingdom; (4) National Hospital for Neurology and Neurosurgery, London, United Kingdom; (5) Department of Biostatistics, University of North Carolina, Chapel Hill, NC, United States; (6) Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY, United States; (7) Departments of Psychiatry, Columbia University Medical Center, New York, NY, United States
The identification of causal relationships between random variables from large-scale observational data using directed acyclic graphs (DAGs) is highly challenging. We propose a new mixed-effects structural equation model (mSEM) framework to estimate subject-specific DAGs, where we represent the joint distribution of random variables in the DAG as a set of structural causal equations with mixed effects. The directed edges between nodes depend on observed exogenous covariates of each individual and on unobserved latent variables. The strength of the connection is decomposed into a fixed-effect term representing the average causal effect given the covariates and a random-effect term representing the latent causal effect due to unobserved pathways. This decomposition captures essential asymmetric structural information and heterogeneity between DAGs, which allows the causal structure to be identified from observational data. In addition, by pooling information across subject-specific DAGs, we can identify causal structure with high probability and estimate subject-specific networks with high precision. We propose a penalized likelihood-based approach to handle the multi-dimensionality of the DAG model. We propose a fast, iterative computational algorithm, DAG-MM, to estimate parameters in mSEM and achieve desirable sparsity by hard-thresholding the edges. We theoretically prove the identifiability of mSEM. Using simulations and an application to protein signaling data, we show substantially improved performance compared with existing methods and results consistent with a network estimated from interventional data. Lastly, we identify gray matter atrophy networks in brain regions of patients with Huntington's disease and corroborate our findings using white matter connectivity data collected from an independent study.
https://www.frontiersin.org/article/10.3389/fgene.2018.00430/full
graphical models; network analysis; causal structure discovery; heterogeneity; regularization |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Xiang Li; Shanghong Xie; Peter McColgan; Sarah J. Tabrizi; Rachael I. Scahill; Donglin Zeng; Yuanjia Wang |
author_sort |
Xiang Li |
title |
Learning Subject-Specific Directed Acyclic Graphs With Mixed Effects Structural Equation Models From Observational Data |
publisher |
Frontiers Media S.A. |
series |
Frontiers in Genetics |
issn |
1664-8021 |
publishDate |
2018-10-01 |
description |
The identification of causal relationships between random variables from large-scale observational data using directed acyclic graphs (DAGs) is highly challenging. We propose a new mixed-effects structural equation model (mSEM) framework to estimate subject-specific DAGs, where we represent the joint distribution of random variables in the DAG as a set of structural causal equations with mixed effects. The directed edges between nodes depend on observed exogenous covariates of each individual and on unobserved latent variables. The strength of the connection is decomposed into a fixed-effect term representing the average causal effect given the covariates and a random-effect term representing the latent causal effect due to unobserved pathways. This decomposition captures essential asymmetric structural information and heterogeneity between DAGs, which allows the causal structure to be identified from observational data. In addition, by pooling information across subject-specific DAGs, we can identify causal structure with high probability and estimate subject-specific networks with high precision. We propose a penalized likelihood-based approach to handle the multi-dimensionality of the DAG model. We propose a fast, iterative computational algorithm, DAG-MM, to estimate parameters in mSEM and achieve desirable sparsity by hard-thresholding the edges. We theoretically prove the identifiability of mSEM. Using simulations and an application to protein signaling data, we show substantially improved performance compared with existing methods and results consistent with a network estimated from interventional data. Lastly, we identify gray matter atrophy networks in brain regions of patients with Huntington's disease and corroborate our findings using white matter connectivity data collected from an independent study. |
topic |
graphical models; network analysis; causal structure discovery; heterogeneity; regularization |
url |
https://www.frontiersin.org/article/10.3389/fgene.2018.00430/full |
_version_ |
1724848829959766016 |
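
The description above says that each edge strength is split into a fixed-effect term that depends on a subject's covariates and a subject-specific random-effect term, and that sparsity comes from hard-thresholding the edges. The toy sketch below illustrates only that idea. It is not the DAG-MM algorithm or the penalized likelihood from the article: the function names (`subject_edge_weights`, `hard_threshold`, `is_acyclic`), the array shapes, the linear form of the fixed effect, and every numeric value are assumptions made for illustration.

```python
import numpy as np


def subject_edge_weights(beta, z, gamma):
    """Edge weights for one subject: fixed effect plus random effect.

    beta  : (p, p, q) hypothetical fixed-effect coefficients
    z     : (q,) observed covariates of the subject
    gamma : (p, p) hypothetical subject-specific random effects
    Entry [j, k] is the strength of the edge from node k to node j.
    """
    return beta @ z + gamma


def hard_threshold(W, tau):
    """Set every edge whose absolute strength is below tau to zero."""
    return np.where(np.abs(W) >= tau, W, 0.0)


def is_acyclic(W):
    """Check that the nonzero pattern of W is a DAG (Kahn's algorithm)."""
    A = W != 0                          # A[j, k] is True for an edge k -> j
    indeg = A.sum(axis=1).astype(int)   # number of parents of each node
    ready = [j for j in range(A.shape[0]) if indeg[j] == 0]
    seen = 0
    while ready:
        k = ready.pop()
        seen += 1
        for j in np.flatnonzero(A[:, k]):  # children of node k
            indeg[j] -= 1
            if indeg[j] == 0:
                ready.append(int(j))
    return seen == A.shape[0]


# Assumed toy setup: 3 nodes, 2 covariates, edges 0 -> 1 and 1 -> 2.
beta = np.zeros((3, 3, 2))
beta[1, 0] = [0.5, 0.2]
beta[2, 1] = [0.0, 1.0]

z = np.array([1.0, 2.0])      # covariates of one subject
gamma = np.zeros((3, 3))
gamma[1, 0] = -0.1            # subject-specific deviation on edge 0 -> 1
gamma[2, 0] = 0.05            # weak latent edge 0 -> 2

W = subject_edge_weights(beta, z, gamma)
W_sparse = hard_threshold(W, tau=0.1)
print(W_sparse)
print("acyclic:", is_acyclic(W_sparse))
```

In this sketch the weak 0 -> 2 edge comes only from the random effect and falls below the assumed threshold, so it is removed, while the two covariate-driven edges are kept. Two subjects with different `z` or `gamma` would get different weighted graphs from the same `beta`, which is the sense in which the graphs are subject-specific.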