Measuring Similarity among Protein Sequences Using a New Descriptor

The comparison of protein sequences according to similarity is a fundamental aspect of today’s biomedical research. With advances in sequencing technologies, the number of protein sequences in public databases is growing exponentially. The best-known sequence comparison methods are alignment...

Bibliographic Details
Main Authors:	Mervat M. Abo-Elkhier, Marwa A. Abd Elwahaab, Moheb I. Abo El Maaty
Format:	Article
Language:	English
Published:	Hindawi Limited 2019-01-01
Series:	BioMed Research International
Online Access:	http://dx.doi.org/10.1155/2019/2796971

id	doaj-62188dfe9e814689a9f859fa494f341f
record_format	Article
affiliation	Department of Engineering Mathematics and Physics, Faculty of Engineering, Mansoura University, Mansoura 35516, Egypt (all three authors)
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Mervat M. Abo-Elkhier Marwa A. Abd Elwahaab Moheb I. Abo El Maaty
title	Measuring Similarity among Protein Sequences Using a New Descriptor
publisher	Hindawi Limited
series	BioMed Research International
issn	2314-6133 2314-6141
publishDate	2019-01-01
description	The comparison of protein sequences according to similarity is a fundamental aspect of today’s biomedical research. With advances in sequencing technologies, the number of protein sequences in public databases is growing exponentially. The best-known sequence comparison methods are alignment based. They generally give excellent results when the sequences under study are closely related, but they are time consuming. Herein, a new alignment-free method is introduced. Our technique depends on a new graphical representation and descriptor. The graphical representation is a simple way to visualize protein sequences. The descriptor compresses the primary sequence into a single vector composed of only two values. Our approach gives good results with both short and long sequences in little computation time. It is applied to nine beta globin, nine ND5 (NADH dehydrogenase subunit 5), and 24 spike protein sequences. Correlation and significance analyses are also introduced to compare our similarity/dissimilarity results with those of other approaches and with sequence homology.
url	http://dx.doi.org/10.1155/2019/2796971
