EXFI: Exon and splice graph prediction without a reference genome
Abstract For population genetic studies in nonmodel organisms, it is important to use every single source of genomic information. This paper presents EXFI, a Python pipeline that predicts the splice graph and exon sequences using an assembled transcriptome and raw whole‐genome sequencing reads. The...
Main Authors: | Jorge Langa, Andone Estonba, Darrell Conklin |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2020-08-01 |
Series: | Ecology and Evolution |
Subjects: | exome sequencing, exon, sequence capture, SNP discovery, splice graph, transcriptome |
Online Access: | https://doi.org/10.1002/ece3.6587 |
id |
doaj-185b7b7ae6f6495cb1a21286036dbfb4 |
---|---|
record_format |
Article |
spelling |
doaj-185b7b7ae6f6495cb1a21286036dbfb4; 2021-04-02T09:27:12Z; eng; Wiley; Ecology and Evolution; ISSN 2045-7758; 2020-08-01; vol. 10, no. 16, pp. 8880-8893; doi:10.1002/ece3.6587; EXFI: Exon and splice graph prediction without a reference genome; Jorge Langa (Department of Genetics, Physical Anthropology and Animal Physiology, Faculty of Science and Technology, University of the Basque Country, Leioa, Spain); Andone Estonba (Department of Genetics, Physical Anthropology and Animal Physiology, Faculty of Science and Technology, University of the Basque Country, Leioa, Spain); Darrell Conklin (Department of Computer Science and Artificial Intelligence, Faculty of Computer Science, University of the Basque Country UPV/EHU, San Sebastián, Spain); https://doi.org/10.1002/ece3.6587; exome sequencing, exon, sequence capture, SNP discovery, splice graph, transcriptome |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Jorge Langa, Andone Estonba, Darrell Conklin |
author_sort |
Jorge Langa |
title |
EXFI: Exon and splice graph prediction without a reference genome |
title_sort |
exfi: exon and splice graph prediction without a reference genome |
publisher |
Wiley |
series |
Ecology and Evolution |
issn |
2045-7758 |
publishDate |
2020-08-01 |
description |
Abstract For population genetic studies in nonmodel organisms, it is important to use every single source of genomic information. This paper presents EXFI, a Python pipeline that predicts the splice graph and exon sequences using an assembled transcriptome and raw whole‐genome sequencing reads. The main algorithm uses Bloom filters to remove reads that are not part of the transcriptome, to predict the intron–exon boundaries, to then proceed to call exons from the assembly, and to generate the underlying splice graph. The results are returned in GFA1 format, which encodes both the predicted exon sequences and how they are connected to form transcripts. EXFI is written in Python, tested on Linux platforms, and the source code is available under the MIT License at https://github.com/jlanga/exfi. |
topic |
exome sequencing, exon, sequence capture, SNP discovery, splice graph, transcriptome |
url |
https://doi.org/10.1002/ece3.6587 |
_version_ |
1724169283669327872 |
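
The abstract describes using Bloom filters over whole-genome reads to locate intron-exon boundaries in assembled transcripts and returning the result in GFA1. The toy sketch below illustrates that general idea only; it is not EXFI's code. The `BloomFilter` class, the `predict_exons` and `to_gfa1` helpers, the sequences, and the `tx1` identifier are all hypothetical and chosen for illustration. It assumes that a transcript k-mer spanning a splice junction is absent from genomic reads because an intron separates the two exons in the genome.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: a few hashed bit positions per item."""

    def __init__(self, size_bits=1 << 16, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # One salted BLAKE2b digest per hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(item.encode(), salt=bytes([seed])).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all((self.bits[pos // 8] >> (pos % 8)) & 1
                   for pos in self._positions(item))


def kmers(seq, k):
    """All overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def predict_exons(transcript, genome_filter, k):
    """Half-open (start, end) intervals covered by runs of k-mers in the filter."""
    hits = [kmer in genome_filter for kmer in kmers(transcript, k)]
    exons, run_start = [], None
    for i, hit in enumerate(hits + [False]):  # sentinel closes the final run
        if hit and run_start is None:
            run_start = i
        elif not hit and run_start is not None:
            exons.append((run_start, i - 1 + k))  # last hit k-mer ends at i-1+k
            run_start = None
    return exons


def to_gfa1(transcript_id, transcript, exons):
    """Emit GFA1 text: one S (segment) line per exon, one L (link) per adjacency."""
    names = [f"{transcript_id}:{start}-{end}" for start, end in exons]
    lines = ["H\tVN:Z:1.0"]
    for name, (start, end) in zip(names, exons):
        lines.append(f"S\t{name}\t{transcript[start:end]}")
    for left, right in zip(names, names[1:]):
        lines.append(f"L\t{left}\t+\t{right}\t+\t0M")
    return "\n".join(lines)


# Hypothetical toy data: two exons separated by an intron in the genome.
K = 7
exon1, intron, exon2 = "ATGGCCAAGTTCGAC", "GTAAGTCTCTTTTAG", "CTGACCGGATACTAA"
genome = exon1 + intron + exon2
transcript = exon1 + exon2

# Simulated error-free genomic reads: 20 bp windows every 5 bp.
reads = [genome[i:i + 20] for i in range(0, len(genome) - 19, 5)]

genome_filter = BloomFilter()
for read in reads:
    for kmer in kmers(read, K):
        genome_filter.add(kmer)

exons = predict_exons(transcript, genome_filter, K)
gfa = to_gfa1("tx1", transcript, exons)
print(gfa)
```

In this sketch the junction-spanning k-mers fail the membership test, so the transcript splits into two segments joined by one link. Real data would also have to deal with sequencing errors, Bloom filter false positives, and repeated k-mers, none of which this example handles.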