Constraint on Rare Protein-Coding Variation: Pathogenicity Prediction and Phenotypic Discovery

Patterns of genetic variation along the human genome provide insight into functional and evolutionary constraints on different loci. Quantifying these patterns of constraint improves our ability to identify functional regions and interpret the phenotypic effects of genetic mutations. Building on exo...

Bibliographic Details
Main Author: Sivley, Robert Michael
Other Authors: Antonis Rokas
Format: Others
Language: en
Published: VANDERBILT 2018
Subjects: Biomedical Informatics
Online Access: http://etd.library.vanderbilt.edu/available/etd-01152018-141952/
id ndltd-VANDERBILT-oai-VANDERBILTETD-etd-01152018-141952
record_format oai_dc
spelling ndltd-VANDERBILT-oai-VANDERBILTETD-etd-01152018-141952
2018-01-23T05:10:57Z
Constraint on Rare Protein-Coding Variation: Pathogenicity Prediction and Phenotypic Discovery
Sivley, Robert Michael
Biomedical Informatics
Antonis Rokas
Jonathan Kropski
Jens Meiler
William S. Bush
John A. Capra
VANDERBILT
2018-01-22
text
application/pdf
http://etd.library.vanderbilt.edu/available/etd-01152018-141952/
en
unrestricted
I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to Vanderbilt University or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report.
collection NDLTD
language en
format Others
sources NDLTD
topic Biomedical Informatics
description Patterns of genetic variation along the human genome provide insight into functional and evolutionary constraints on different loci. Quantifying these patterns of constraint improves our ability to identify functional regions and interpret the phenotypic effects of genetic mutations. Building on exome-sequencing data from tens of thousands of individuals, we are now able to quantify constraint on a large scale. In this work, we explore three avenues by which constraint on rare protein-coding variation can be used to better understand human biology and elucidate the genetic drivers of disease. We first present a novel algorithm to classify variants of unknown significance (VUS) using patterns of spatial constraint on disease-causing variation in protein structure. We demonstrate its utility in classifying VUS in RTEL1, a helicase protein, from patients with familial interstitial pneumonia. Next, we quantify spatial constraint on somatic mutations in 3D protein structures and identify patterns indicative of driver mutations in several proteins. Finally, we perform phenome-wide association studies (PheWAS) to interrogate the phenotypic impact of rare protein-coding variants in genes intolerant to loss-of-function mutations. This dissertation makes significant advances in our understanding of how evolutionary constraint on protein-coding genetic variants is related to their contribution to human disease. In particular, we leveraged this progress to develop powerful approaches to variant pathogenicity prediction, the detection of putative driver mutations in cancer, and the identification of novel phenotype associations for highly constrained genes.
author2 Antonis Rokas
author_facet Antonis Rokas
Sivley, Robert Michael
author Sivley, Robert Michael
author_sort Sivley, Robert Michael
title Constraint on Rare Protein-Coding Variation: Pathogenicity Prediction and Phenotypic Discovery
title_sort constraint on rare protein-coding variation: pathogenicity prediction and phenotypic discovery
publisher VANDERBILT
publishDate 2018
url http://etd.library.vanderbilt.edu/available/etd-01152018-141952/