GFusion: an Effective Algorithm to Identify Fusion Genes from Cancer RNA-Seq Data
Abstract Fusion genes derived from genomic rearrangement play a key role in cancer initiation. The discovery of novel gene fusions may be of significant importance in cancer diagnosis and treatment. Meanwhile, next-generation sequencing technology provides a sensitive and efficient way to identify ge...
Main Authors: | Jian Zhao, Qi Chen, Jing Wu, Ping Han, Xiaofeng Song |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Publishing Group, 2017-07-01 |
Series: | Scientific Reports |
Online Access: | https://doi.org/10.1038/s41598-017-07070-6 |
id |
doaj-a6ce249b08754520b1b3408c7789b6e9 |
---|---|
record_format |
Article |
spelling |
doaj-a6ce249b08754520b1b3408c7789b6e9; 2020-12-08T02:30:50Z; eng; Nature Publishing Group; Scientific Reports; 2045-2322; 2017-07-01; 71112; 10.1038/s41598-017-07070-6; GFusion: an Effective Algorithm to Identify Fusion Genes from Cancer RNA-Seq Data; Jian Zhao (0), Qi Chen (1), Jing Wu (2), Ping Han (3), Xiaofeng Song (4); (0) Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics; (1) Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics; (2) Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics; (3) Department of Gynecology and Obstetrics, The First Affiliated Hospital with Nanjing Medical University; (4) Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics; Abstract Fusion genes derived from genomic rearrangement play a key role in cancer initiation. The discovery of novel gene fusions may be of significant importance in cancer diagnosis and treatment. Meanwhile, next-generation sequencing technology provides a sensitive and efficient way to identify gene fusions at the genomic level. However, many challenges and limitations remain in the existing methods, which rely only on unmapped reads or discordant alignment fragments. In this work we have developed GFusion, a novel method that uses RNA-Seq data to identify fusion genes. The pipeline performs multiple alignments and applies a strict filtering algorithm to improve sensitivity and reduce the false positive rate. GFusion successfully detected 34 of 43 previously reported fusions in four cancer datasets. We also demonstrated the effectiveness of GFusion using simulated data of 24 million 76 bp paired-end reads containing 42 artificial fusion genes, of which GFusion successfully discovered 37. Compared with existing methods, GFusion showed higher sensitivity and a lower false positive rate. The GFusion pipeline can be accessed freely for non-commercial purposes at https://github.com/xiaofengsong/GFusion; https://doi.org/10.1038/s41598-017-07070-6 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Jian Zhao; Qi Chen; Jing Wu; Ping Han; Xiaofeng Song |
author_sort |
Jian Zhao |
title |
GFusion: an Effective Algorithm to Identify Fusion Genes from Cancer RNA-Seq Data |
publisher |
Nature Publishing Group |
series |
Scientific Reports |
issn |
2045-2322 |
publishDate |
2017-07-01 |
description |
Abstract Fusion genes derived from genomic rearrangement play a key role in cancer initiation. The discovery of novel gene fusions may be of significant importance in cancer diagnosis and treatment. Meanwhile, next-generation sequencing technology provides a sensitive and efficient way to identify gene fusions at the genomic level. However, many challenges and limitations remain in the existing methods, which rely only on unmapped reads or discordant alignment fragments. In this work we have developed GFusion, a novel method that uses RNA-Seq data to identify fusion genes. The pipeline performs multiple alignments and applies a strict filtering algorithm to improve sensitivity and reduce the false positive rate. GFusion successfully detected 34 of 43 previously reported fusions in four cancer datasets. We also demonstrated the effectiveness of GFusion using simulated data of 24 million 76 bp paired-end reads containing 42 artificial fusion genes, of which GFusion successfully discovered 37. Compared with existing methods, GFusion showed higher sensitivity and a lower false positive rate. The GFusion pipeline can be accessed freely for non-commercial purposes at https://github.com/xiaofengsong/GFusion. |
url |
https://doi.org/10.1038/s41598-017-07070-6 |