Summary: | Thesis (Ph. D.)--Massachusetts Institute of Technology, Biological Engineering Division, 2003. === Includes bibliographical references (p. 211-223). === Glycosaminoglycans (GAGs) are a family of complex carbohydrates whose known biological roles have expanded dramatically in recent years. It is becoming increasingly evident that sequence-specific GAG-protein interactions play critical roles in cell growth, development, angiogenesis, cancer, anticoagulation and microbial pathogenesis. It is therefore important to understand the specificity of glycan-protein interactions and how these specific interactions influence their structure-function relationships. This thesis addresses many challenges in this emerging area of glycomics by taking an integrated approach that couples biophysical and biochemical methods with a bioinformatics framework to represent and process the sequence information content of GAGs. With this motivation, the thesis is divided into four components. 1. Using heparin/heparan sulfate GAGs (HSGAGs) and fibroblast growth factor (FGF) as a model GAG-protein system, the first part focuses on determining the structural basis of FGF oligomerization, sequence-specific FGF-HSGAG interactions and FGF-receptor (FGFR) interactions, which collectively influence the specificity of HSGAG-mediated FGF signaling. 2. The second part focuses on developing enzymatic tools for the analysis of GAGs, using chondroitinase B, HSGAG 2-O sulfatase and 3-O sulfotransferase as model enzymes. For each of these enzymes, a theoretical model of the enzyme-substrate structural complex is developed and coupled with site-directed mutagenesis and biochemical studies to determine its catalytic mechanism and substrate specificity. 3. To deal with the heterogeneity and high information density of GAG sequences, an informatics-based approach to decoding GAG sequence information has been developed.
A new property-encoded nomenclature (PEN) computational framework has been formulated to encode and process the information content of GAG sequences. The numerical nature of the PEN code facilitated the incorporation of diverse data sets from different analytical methods, including mass spectrometry, electrophoresis and NMR, as constraints to accurately determine the sequence of GAGs. Two practical methodologies for sequencing GAGs have been developed based on this approach. 4. The last part outlines the development of a powerful relational database capable of bridging sequence, structure and function information in glycomics. This database and its associated computational tools for searching and mining the data are thus an important resource for advancing glycomics. === by Rahul Raman. === Ph.D.
|