Machine learning methods for prediction of disulphide bonding states of cysteine residues in proteins

The goal of this thesis work is to develop a computational method based on machine learning techniques for predicting disulfide-bonding states of cysteine residues in proteins, which is a sub-problem of a bigger and yet unsolved problem of protein structure prediction. Improvement in the prediction...

Bibliographic Details
Main Author: Shukla, Priyank <1984>
Other Authors: Casadio, Rita
Format: Doctoral Thesis
Language: en
Published: Alma Mater Studiorum - Università di Bologna 2010
Subjects: INF/01 Informatica
Online Access: http://amsdottorato.unibo.it/2588/
id ndltd-unibo.it-oai-amsdottorato.cib.unibo.it-2588
record_format oai_dc
spelling ndltd-unibo.it-oai-amsdottorato.cib.unibo.it-2588 2014-03-24T16:28:34Z Machine learning methods for prediction of disulphide bonding states of cysteine residues in proteins Shukla, Priyank <1984> INF/01 Informatica Alma Mater Studiorum - Università di Bologna Casadio, Rita 2010-05-05 Doctoral Thesis PeerReviewed application/pdf en http://amsdottorato.unibo.it/2588/ info:eu-repo/semantics/openAccess
collection NDLTD
language en
format Doctoral Thesis
sources NDLTD
topic INF/01 Informatica
description The goal of this thesis work is to develop a computational method, based on machine learning techniques, for predicting the disulfide-bonding states of cysteine residues in proteins, a sub-problem of the larger and still unsolved problem of protein structure prediction. Better prediction of disulfide-bonding states places constraints on the three-dimensional (3D) space of the respective protein structure, and thus will eventually help in predicting the 3D structure of proteins. The results of this work have direct implications for site-directed mutational studies of proteins, protein engineering, and the problem of protein folding. We used a combination of an Artificial Neural Network (ANN) and a Hidden Markov Model (HMM), the so-called Hidden Neural Network (HNN), as the machine learning technique for our prediction method. Using different global and local features of proteins as inputs (specifically profiles, parity of cysteine residues, average cysteine conservation, correlated mutation, sub-cellular localization, and signal peptide), and considering Eukaryotes and Prokaryotes separately, we reached an accuracy of 94% on a cysteine basis for both the Eukaryotic and Prokaryotic datasets, and accuracies of 90% and 93% on a protein basis for the Eukaryotic and Prokaryotic datasets, respectively. These are the highest accuracies reported so far by any existing prediction method; our method thus outperforms the previously developed approaches and is therefore more reliable. The most interesting finding of this thesis work is the difference in prediction performance between Eukaryotes and Prokaryotes at the basic level of input coding, when ‘profile’ information was given as input to our prediction method. One of the reasons we found for this is the difference in the amino acid composition of the local environment of bonded and free cysteine residues in Eukaryotes and Prokaryotes: Eukaryotic bonded cysteine examples have a ‘symmetric-cysteine-rich’ environment, whereas Prokaryotic bonded examples lack it.
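The abstract reports accuracy both "on a cysteine basis" and "on a protein basis". The minimal Python sketch below illustrates how the two figures can differ on the same predictions. It assumes, since the record itself does not define the terms, that per-cysteine accuracy is the fraction of cysteines whose bonding state is predicted correctly, and that a protein counts as correct only when every one of its cysteines is correct. The function name, the label encoding (1 = bonded, 0 = free), and the protein identifiers are hypothetical and are not taken from the thesis.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: 1 = disulfide-bonded cysteine, 0 = free cysteine.
# Keys are protein identifiers; values list the states of that protein's cysteines.
Labels = Dict[str, List[int]]


def bonding_state_accuracies(true: Labels, pred: Labels) -> Tuple[float, float]:
    """Return (per-cysteine accuracy, per-protein accuracy).

    Assumption: a protein is counted as correct only if all of its
    cysteines are assigned the correct bonding state.
    """
    if not true:
        raise ValueError("no proteins to evaluate")

    total_cys = 0
    correct_cys = 0
    correct_proteins = 0

    for protein_id, true_states in true.items():
        pred_states = pred[protein_id]
        if len(pred_states) != len(true_states):
            raise ValueError(f"length mismatch for {protein_id}")

        matches = sum(t == p for t, p in zip(true_states, pred_states))
        total_cys += len(true_states)
        correct_cys += matches
        if matches == len(true_states):
            correct_proteins += 1

    if total_cys == 0:
        raise ValueError("no cysteines to evaluate")

    return correct_cys / total_cys, correct_proteins / len(true)


# Toy data, invented for illustration only.
true_labels = {"protA": [1, 1, 0, 0], "protB": [1, 1], "protC": [0, 0, 0]}
pred_labels = {"protA": [1, 1, 0, 1], "protB": [1, 1], "protC": [0, 0, 0]}

per_cys, per_protein = bonding_state_accuracies(true_labels, pred_labels)
print(f"per-cysteine: {per_cys:.3f}, per-protein: {per_protein:.3f}")
```

In this toy case a single wrong cysteine in protA lowers the per-protein figure more than the per-cysteine figure. Under the stated assumption, per-protein accuracy can never be higher than per-cysteine accuracy.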
author2 Casadio, Rita
author_facet Casadio, Rita
Shukla, Priyank <1984>
author Shukla, Priyank <1984>
author_sort Shukla, Priyank <1984>
title Machine learning methods for prediction of disulphide bonding states of cysteine residues in proteins
title_sort machine learning methods for prediction of disulphide bonding states of cysteine residues in proteins
publisher Alma Mater Studiorum - Università di Bologna
publishDate 2010
url http://amsdottorato.unibo.it/2588/