Deriving a scoring matrix for mapping protein local structure and sequence

Master's === National Yang-Ming University === Institute of Bioinformatics === 94 === The correlation between protein local structure and sequence is low (r ~ -0.12) when the two are matched using existing scoring matrices for amino acid sequence similarity. Here we improve the correlation with a new amino acid substitution scoring matrix. We created...


Bibliographic Details
Main Authors: Chia-Chuan Liu, 劉家銓
Other Authors: Ming-Jing Hwang
Format: Others
Language: en_US
Published: 2006
Online Access: http://ndltd.ncl.edu.tw/handle/83868168879023365235
id ndltd-TW-094YM005112001
record_format oai_dc
spelling ndltd-TW-094YM005112001 2015-10-13T16:31:15Z http://ndltd.ncl.edu.tw/handle/83868168879023365235 Deriving a scoring matrix for mapping protein local structure and sequence 以氨基酸計分矩陣進行蛋白質局部結構與序列映對 Chia-Chuan Liu 劉家銓 Master's National Yang-Ming University Institute of Bioinformatics 94 Ming-Jing Hwang 黃明經 2006 thesis 49 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Yang-Ming University === Institute of Bioinformatics === 94 === The correlation between protein local structure and sequence is low (r ~ -0.12) when the two are matched using existing scoring matrices for amino acid sequence similarity. Here we improve the correlation with a new amino acid substitution scoring matrix. We created fragment pairs chosen randomly from PDBselect 25 (a set of protein structures with sequence identity less than 25%) and used a genetic algorithm (GA) to optimize the correlation. In our results, the GA-optimized scoring matrices for fragment lengths of 5, 7, and 9 amino acids achieved a much better correlation (r ~ -0.5). The same approach was then applied to local structure classification, using as a test set the I-sites library, a set of sequence patterns that correlate strongly with protein structure at the local level. The GA-optimized scoring matrix again achieved better results. Thus, in this work we have developed a GA-based approach that can produce amino acid substitution matrices suitable for mapping protein local structure and sequence.
author2 Ming-Jing Hwang
author_facet Ming-Jing Hwang
Chia-Chuan Liu
劉家銓
author Chia-Chuan Liu
劉家銓
spellingShingle Chia-Chuan Liu
劉家銓
Deriving a scoring matrix for mapping protein local structure and sequence
author_sort Chia-Chuan Liu
title Deriving a scoring matrix for mapping protein local structure and sequence
publishDate 2006
url http://ndltd.ncl.edu.tw/handle/83868168879023365235
_version_ 1717771188788264960
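
The optimization summarized in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the record does not state the structural measure, the GA operators, or the parameters, so everything below is an assumption made for illustration. In particular, the sketch assumes that each fragment pair carries a structural distance (for example an RMSD), that a fragment pair is scored by summing per-position substitution scores, and that the fitness is the Pearson correlation between score and distance (more negative is better, consistent with the negative r values quoted). The names toy_pairs, fitness, and optimize are hypothetical, and the data are synthetic rather than drawn from PDBselect 25.

```python
import random
from itertools import combinations_with_replacement

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, alphabetical


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def random_matrix(rng):
    """Symmetric substitution matrix stored as {(a, b): score} with a <= b."""
    return {pair: rng.uniform(-1, 1)
            for pair in combinations_with_replacement(AMINO_ACIDS, 2)}


def fragment_score(matrix, frag_a, frag_b):
    """Sum of per-position substitution scores for two equal-length fragments."""
    return sum(matrix[(a, b) if a <= b else (b, a)]
               for a, b in zip(frag_a, frag_b))


def fitness(matrix, pairs):
    """Correlation between sequence score and structural distance.

    Assumption: a good matrix gives high scores to structurally close
    pairs, so a more negative correlation is better.
    """
    scores = [fragment_score(matrix, a, b) for a, b, _ in pairs]
    dists = [d for _, _, d in pairs]
    return pearson(scores, dists)


def crossover(m1, m2, rng):
    """Uniform crossover: each entry is taken from one parent at random."""
    return {k: (m1[k] if rng.random() < 0.5 else m2[k]) for k in m1}


def mutate(matrix, rng, rate=0.05, step=0.3):
    """Perturb a random subset of entries with Gaussian noise."""
    return {k: (v + rng.gauss(0, step) if rng.random() < rate else v)
            for k, v in matrix.items()}


def optimize(pairs, generations=50, pop_size=30, seed=0):
    """Simple elitist GA: keep the better half, refill with offspring."""
    rng = random.Random(seed)
    population = [random_matrix(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda m: fitness(m, pairs))
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return min(population, key=lambda m: fitness(m, pairs))


def toy_pairs(n=200, length=5, seed=1):
    """Synthetic stand-in data: distance grows with the number of mismatches."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        a = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        b = "".join(rng.choice(AMINO_ACIDS) if rng.random() < 0.5 else ch
                    for ch in a)
        dist = sum(x != y for x, y in zip(a, b)) + rng.gauss(0, 0.5)
        pairs.append((a, b, dist))
    return pairs


if __name__ == "__main__":
    data = toy_pairs()
    best = optimize(data)
    print("correlation after optimization:", round(fitness(best, data), 3))
```

Because the better half of each generation is carried over unchanged, the best correlation in the population never gets worse from one generation to the next. With real data, the synthetic toy_pairs would be replaced by fragment pairs and measured structural distances, and the fragment length would be set to 5, 7, or 9 as in the abstract.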