Applying Structural Domain Information for Enzyme Reaction Annotation and Protein-Protein Interaction Inference
PhD === National Tsing Hua University === Department of Computer Science === 102 === Domains are the fundamental building blocks of proteins, which perform a variety of functions within living organisms, including catalysis, signal transduction, and nutrient transport. The majority of proteins are composed of more than two domains that recognize a...
Main Authors: | Huang, Chuan-Ching, 黃筌敬 |
---|---|
Other Authors: | Tang, Chuan Yi |
Format: | Others |
Language: | en_US |
Published: | 2013 |
Online Access: | http://ndltd.ncl.edu.tw/handle/59906111277978423707 |
id |
ndltd-TW-102NTHU5392029 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-102NTHU5392029 2016-07-01T04:28:18Z http://ndltd.ncl.edu.tw/handle/59906111277978423707 Applying Structural Domain Information for Enzyme Reaction Annotation and Protein-Protein Interaction Inference 應用蛋白質片段資訊進行酵素功能標註及推論蛋白質間交互作用關係 Huang, Chuan-Ching 黃筌敬 PhD National Tsing Hua University Department of Computer Science 102 Tang, Chuan Yi 唐傳義 2013 thesis 84 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
PhD === National Tsing Hua University === Department of Computer Science === 102 === Domains are the fundamental building blocks of proteins, which perform a variety of functions within living organisms, including catalysis, signal transduction, and nutrient transport. The majority of proteins are composed of more than two domains that recognize and bind structural units in other proteins through protein-protein interactions. This dissertation uses the domain composition of proteins to investigate two main topics: “enzyme reaction prediction based on the domain architecture of an enzyme” and “inferring protein-protein interactions (PPIs) from domain-domain interactions (DDIs)”.
The gap between newly sequenced proteins and proteins with characterized functions has widened with the advent of high-throughput genome sequencing in the post-genomic era. Identifying the functions of a protein through manually curated sequence annotation is a challenging task; therefore, automated protein function prediction techniques are necessary. The enzyme nomenclature proposed by the International Union of Biochemistry and Molecular Biology provides a well-defined four-field number (the EC number) for enzyme classification. The first three numbers describe the overall type of enzymatic reaction, and the last number denotes the substrate specificity of the reaction. Proteins were grouped into two data sets: the 3-numerical-block set and the 4-numerical-block set. Each data set was further divided into single-EC cases and multiple-EC cases, according to whether the protein performs more than one enzymatic reaction. In the single-EC case, the well-known association rule method correctly classified 96% and 91% of entries for the 3-numerical-block set and the 4-numerical-block set, respectively. The proposed enzyme reaction prediction (ERP) method showed marginally higher accuracy, at 99% and 92%, respectively. Predicting multiple enzymatic activities for a single protein is more difficult. In the multiple-EC case, the fractions of entries correctly predicted for the 3-numerical-block set and the 4-numerical-block set were 17% and 8%, respectively, for the association rule method, and 49% and 42%, respectively, for the ERP method.
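The grouping described above can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the function names, the toy annotations, and the rule that a protein counts as single-EC when all of its EC numbers collapse to one label at the chosen resolution are assumptions made for illustration only.

```python
def ec_blocks(ec: str, n: int) -> str:
    """Return the first n dot-separated fields of a four-field EC number."""
    fields = ec.strip().split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four fields, got {ec!r}")
    return ".".join(fields[:n])


def split_single_multiple(annotations, n):
    """Split proteins into single-EC and multiple-EC cases at n-block resolution.

    Assumption: a protein is 'single-EC' when all of its EC numbers
    reduce to the same label after truncation to n fields.
    """
    single, multiple = {}, {}
    for protein, ecs in annotations.items():
        labels = {ec_blocks(ec, n) for ec in ecs}
        if not labels:
            continue  # skip proteins without any EC annotation
        target = single if len(labels) == 1 else multiple
        target[protein] = labels
    return single, multiple


# Hypothetical toy annotations (protein id -> list of EC numbers).
annotations = {
    "P1": ["1.1.1.1"],
    "P2": ["2.7.1.1", "2.7.1.2"],
    "P3": ["3.4.21.4", "1.1.1.1"],
}

single3, multiple3 = split_single_multiple(annotations, 3)
single4, multiple4 = split_single_multiple(annotations, 4)
```

Under this assumed rule, the hypothetical protein `P2` is a single-EC case in the 3-numerical-block set but a multiple-EC case in the 4-numerical-block set, because its two EC numbers differ only in the last field.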
Biological processes are carried out when one protein recognizes and binds certain structural elements in other proteins through PPIs. Therefore, it is possible to explore protein functions from protein interactions at the domain level. Noroviruses cause severe gastroenteritis and foodborne illness worldwide during the winter. There is no effective vaccine for noroviruses because of their variable genome sequences. Vulnerable patients infected with noroviruses often require hospitalization and may die. We attempted to build a protein interaction network at the domain level to support clinical applications and drug design.
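The idea of inferring PPIs from DDIs can be sketched as follows. The abstract does not state the scoring or filtering used, so this sketch assumes the simplest possible rule: two proteins are predicted to interact if at least one domain from each forms a known interacting domain pair. All protein and domain identifiers are hypothetical placeholders.

```python
from itertools import combinations


def infer_ppis(protein_domains, ddis):
    """Predict protein pairs supported by at least one known domain-domain interaction.

    protein_domains: dict mapping protein id -> set of domain ids
    ddis: iterable of (domain, domain) pairs, treated as unordered
    Returns a dict mapping (protein_a, protein_b) -> set of supporting domain pairs.
    """
    known = {frozenset(pair) for pair in ddis}
    predicted = {}
    for a, b in combinations(sorted(protein_domains), 2):
        support = {
            (da, db)
            for da in protein_domains[a]
            for db in protein_domains[b]
            if frozenset((da, db)) in known
        }
        if support:
            predicted[(a, b)] = support
    return predicted


# Hypothetical toy input: domain content per protein and one known DDI.
protein_domains = {
    "protA": {"DomA", "DomB"},
    "protB": {"DomC"},
    "protC": {"DomD"},
}
ddis = [("DomC", "DomA")]

ppis = infer_ppis(protein_domains, ddis)
```

Keeping the supporting domain pairs alongside each predicted interaction makes it possible to trace every edge of the resulting network back to the domain-level evidence.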
|
author2 |
Tang, Chuan Yi |
author_facet |
Tang, Chuan Yi Huang, Chuan-Ching 黃筌敬 |
author |
Huang, Chuan-Ching 黃筌敬 |
spellingShingle |
Huang, Chuan-Ching 黃筌敬 Applying Structural Domain Information for Enzyme Reaction Annotation and Protein-Protein Interaction Inference |
author_sort |
Huang, Chuan-Ching |
title |
Applying Structural Domain Information for Enzyme Reaction Annotation and Protein-Protein Interaction Inference |
title_short |
Applying Structural Domain Information for Enzyme Reaction Annotation and Protein-Protein Interaction Inference |
title_full |
Applying Structural Domain Information for Enzyme Reaction Annotation and Protein-Protein Interaction Inference |
title_fullStr |
Applying Structural Domain Information for Enzyme Reaction Annotation and Protein-Protein Interaction Inference |
title_full_unstemmed |
Applying Structural Domain Information for Enzyme Reaction Annotation and Protein-Protein Interaction Inference |
title_sort |
applying structural domain information for enzyme reaction annotation and protein-protein interaction inference |
publishDate |
2013 |
url |
http://ndltd.ncl.edu.tw/handle/59906111277978423707 |
work_keys_str_mv |
AT huangchuanching applyingstructuraldomaininformationforenzymereactionannotationandproteinproteininteractioninference AT huángquánjìng applyingstructuraldomaininformationforenzymereactionannotationandproteinproteininteractioninference AT huangchuanching yīngyòngdànbáizhìpiànduànzīxùnjìnxíngjiàosùgōngnéngbiāozhùjítuīlùndànbáizhìjiānjiāohùzuòyòngguānxì AT huángquánjìng yīngyòngdànbáizhìpiànduànzīxùnjìnxíngjiàosùgōngnéngbiāozhùjítuīlùndànbáizhìjiānjiāohùzuòyòngguānxì |
_version_ |
1718331115001872384 |