Summary: | Several diseases related to cell proliferation are characterized by the accumulation of somatic DNA changes relative to the wild-type condition. Cancer and HIV are two common examples of such diseases, in which the mutational load of the cancerous or viral population increases over time. In these cases, selective pressures are often observed along with competition, cooperation, and parasitism among distinct cellular clones. Recently, we presented a mathematical framework to model these phenomena, based on a combination of Bayesian inference and Suppes’ theory of probabilistic causation and depicted in graphical structures dubbed Suppes-Bayes Causal Networks (SBCNs). SBCNs are generative probabilistic graphical models that recapitulate the potential ordering in which such DNA changes accumulate during the progression of the disease. Such models can be inferred from data by exploiting likelihood-based model selection strategies with regularization. In this article, we discuss the theoretical foundations of our approach and investigate in depth how the model selection task is influenced by (1) the poset based on Suppes’ theory and (2) different regularization strategies. Furthermore, we provide an example application of our framework to HIV genetic data, highlighting the valuable insights provided by the inferred SBCN.
|
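As a rough illustration of the poset based on Suppes' theory mentioned in the summary, the sketch below derives candidate precedence relations from a binary sample-by-event matrix. It assumes the two conditions commonly attributed to Suppes' probabilistic causation, temporal priority (the cause is more frequent than the effect) and probability raising (the effect is more likely when the cause is present); the summary itself does not spell these out, and the function name `suppes_poset` and the toy data are hypothetical. Regularized likelihood-based model selection, which the framework applies afterwards, is not shown.

```python
import numpy as np


def suppes_poset(data):
    """Return a boolean matrix `allowed`, where allowed[i, j] is True when
    event i passes both assumed Suppes conditions as a candidate cause of j.

    `data` is a samples-by-events binary matrix (1 = event observed).
    """
    data = np.asarray(data, dtype=bool)
    n_events = data.shape[1]
    marginal = data.mean(axis=0)  # observed frequency of each event
    allowed = np.zeros((n_events, n_events), dtype=bool)

    for i in range(n_events):
        with_i = data[data[:, i]]      # samples where event i occurred
        without_i = data[~data[:, i]]  # samples where it did not
        if len(with_i) == 0 or len(without_i) == 0:
            continue  # conditional frequencies are undefined
        for j in range(n_events):
            if i == j:
                continue
            # Temporal priority (assumed form): P(i) > P(j)
            temporal_priority = marginal[i] > marginal[j]
            # Probability raising (assumed form): P(j | i) > P(j | not i)
            probability_raising = with_i[:, j].mean() > without_i[:, j].mean()
            allowed[i, j] = temporal_priority and probability_raising
    return allowed


# Hypothetical toy data: rows are samples, columns are events A, B, C.
toy = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
]
poset = suppes_poset(toy)
print(poset.astype(int))
```

On this toy matrix the only relations that survive are A before B, A before C, and B before C. In a full pipeline, a model selection step would then search for the best-scoring network using only edges allowed by this matrix.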