Summary: | High-throughput technologies in the biological sciences have led to exponential growth in the amount of data generated over the past several years. This data explosion is forcing scientists to search for innovative computational designs that reduce the time scale of biological system simulations and enable rapid study of larger and more complex biological systems. In the field of immunobiology, one such simulation models DNA recombination. It is critical for investigating the correlation between disease and immune system responses, and for discovering the immunological changes that occur during aging through T-cell repertoire analysis. In this project we design and develop a massively parallel method tailored for Graphics Processing Units (GPUs) by identifying novel ways of restructuring the flow of the repertoire analysis. The DNA recombination process is the central mechanism for generating diversity among antigen receptors such as T-cell receptors (TCRs). This diversity is crucial for the development of the adaptive immune system. However, modeling all αβ TCR sequences is hindered by the sheer size of the potential repertoire, which has been predicted to exceed 10¹⁵ sequences. Prior modeling efforts have therefore been limited to extrapolations based on the analysis of minor subsets of the overall TCRβ repertoire. In this study, we map the recombination process completely onto the GPU hardware architecture using the CUDA programming environment to circumvent these limitations. For the first time, we present a model of the mouse TCRβ that is extensive enough to evaluate the Convergent Recombination Hypothesis (CRH) comprehensively at peta-scale on a single GPU. Understanding the recombination process will allow scientists to better determine the likelihood of transplant rejection, to predict immune system responses to foreign antigens and cancers, and to plan treatments based on the genetic makeup of a given patient.
|