Summary: | Cellular transformations caused by changes in gene expression, protein abundance or epigenetic features play a key role in differentiation, reprogramming and disease. Such transformations are frequently stochastic at the single-cell level. The result is a heterogeneous cell population whose composition changes over time. Cells often undergo transformation via intermediate stages, which further complicates the process. Reliable high-throughput data is commonly obtained at the cell population level; elucidating the underlying single-cell process is therefore challenging. In this thesis we present and analyse models that probe population-level data to answer questions about the transformation process and to distinguish between states. We investigate a recently proposed stochastic model for transition processes called STAMM, which is based on a latent Markov chain at the single-cell level. We present a computationally efficient, unbiased approach to estimation, model selection and setting of tuning parameters. To complement our understanding of the properties and behaviour of the model, we implement a single-cell simulation setup. This allows us not only to investigate parameter estimation but also to explore the behaviour of the model under violations of its assumptions. We also empirically investigate identifiability of the model. We apply the model to oncogenic transformation, where the data time-course consists of genome-wide RNA-seq measurements. We also apply STAMM to a stem cell reprogramming microarray time-course and compare the results with single-cell measurements carried out independently. Results show that the model is robust under mild violations of assumptions and that state-specific results can be compared to single-cell measurements. Under stronger violations of assumptions, transitions between states are not estimated well. The model is therefore especially useful to steer further experiments in the right direction.
We then present a model that examines the response of cells in the cell cycle to incident radiation at different doses. Cells can either undergo programmed cell death or re-enter the cell cycle after an interruption. A genome-wide RNA-seq measurement is made at the initial time point, and subsequently the fractions of cells with contrasting cell fates can be distinguished and counted. The model assigns a score to each gene corresponding to its importance in determining cell fate. We implement a single-cell level simulation procedure and carry out illustrative simulations for one gene and for four genes. Parameter estimation in this model distinguishes genes that are important from genes that are not, provided the noise level is not too high.
|