G-MAPSEQ – a new method for mapping reads to a reference genome


Bibliographic Details
Main Authors: Wojciechowski Pawel, Frohmberg Wojciech, Kierzynka Michal, Zurkowski Piotr, Blazewicz Jacek
Format: Article
Language: English
Published: Sciendo 2016-06-01
Series: Foundations of Computing and Decision Sciences
Online Access: https://doi.org/10.1515/fcds-2016-0007
Description
Summary: Mapping reads to a reference genome is one of the most essential problems in modern computational biology. The most popular algorithms for this problem are based on the Burrows-Wheeler transform and the FM-index. However, these approaches have trouble with highly mutated sequences because they allow only a limited number of mutations. G-MAPSEQ is a novel hybrid algorithm that combines two methods: alignment-free sequence comparison and ultra-fast sequence alignment. The former is a fast heuristic that uses the k-mer characteristics of nucleotide sequences to find potential mapping positions. The latter is a very fast GPU implementation of sequence alignment that verifies the correctness of these mapping positions. The source code of G-MAPSEQ, along with other bioinformatics software, is available at: http://gpualign.cs.put.poznan.pl.
ISSN: 2300-3405
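
Illustration: The two-stage idea in the summary (k-mer based search for candidate positions, followed by alignment-based verification) can be sketched roughly as below. This is a generic seed-and-verify toy and not the G-MAPSEQ implementation: the function names, the diagonal-voting heuristic, the parameters k, top_n and max_edits, and the plain CPU edit distance standing in for the GPU alignment are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer of the reference to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def candidate_positions(read, index, k, top_n=3):
    """Alignment-free step (assumed heuristic): each shared k-mer votes for
    the reference offset at which the read would start."""
    votes = Counter()
    for j in range(len(read) - k + 1):
        for i in index.get(read[j:j + k], ()):
            votes[i - j] += 1
    return [pos for pos, _ in votes.most_common(top_n)]


def edit_distance(a, b):
    """Levenshtein distance; a CPU stand-in for the GPU alignment step."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]


def map_read(read, reference, index, k, max_edits):
    """Verify each candidate; return (position, edits) of the best one,
    or None if no candidate is within max_edits."""
    best = None
    for pos in candidate_positions(read, index, k):
        start = max(pos, 0)
        window = reference[start:start + len(read)]
        d = edit_distance(read, window)
        if d <= max_edits and (best is None or d < best[1]):
            best = (start, d)
    return best
```

In this sketch the k-mer votes only propose positions, and the edit-distance check decides whether a proposal is accepted, so a read with a substitution in the middle can still be placed as long as enough unaffected k-mers agree on one offset.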