Alignment-free clustering of large data sets of unannotated protein conserved regions using minhashing

Abstract. Background: Clustering of protein sequences is of key importance in predicting the structure and function of newly sequenced proteins and is also of use for their annotation. With the advent of multiple high-throughput sequencing technologies, new protein sequences are becoming available at...

Bibliographic Details
Main Authors: Armen Abnousi, Shira L. Broschat, Ananth Kalyanaraman
Format: Article
Language: English
Published: BMC 2018-03-01
Series: BMC Bioinformatics
Online Access: http://link.springer.com/article/10.1186/s12859-018-2080-y