Evaluating alignment and variant-calling software for mutation identification in C. elegans by whole-genome sequencing.

Whole-genome sequencing is a powerful tool for analyzing genetic variation on a global scale. One particularly useful application is the identification of mutations obtained by classical phenotypic screens in model species. Sequence data from the mutant strain is aligned to the reference genome, and...

Bibliographic Details
Main Authors: Harold E Smith, Sijung Yun
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2017-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC5363872?pdf=render
id doaj-b59222f36f484807bfb3248f0b6fbbc8
record_format Article
spelling doaj-b59222f36f484807bfb3248f0b6fbbc8; 2020-11-25T02:23:08Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2017-01-01; vol. 12, no. 3, e0174446; doi 10.1371/journal.pone.0174446
collection DOAJ
language English
format Article
sources DOAJ
author Harold E Smith
Sijung Yun
title Evaluating alignment and variant-calling software for mutation identification in C. elegans by whole-genome sequencing.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2017-01-01
description Whole-genome sequencing is a powerful tool for analyzing genetic variation on a global scale. One particularly useful application is the identification of mutations obtained by classical phenotypic screens in model species. Sequence data from the mutant strain is aligned to the reference genome, and then variants are called to generate a list of candidate alleles. A number of software pipelines for mutation identification have been targeted to C. elegans, with particular emphasis on ease of use, incorporation of mapping strain data, subtraction of background variants, and similar criteria. Although success is predicated upon the sensitive and accurate detection of candidate alleles, relatively little effort has been invested in evaluating the underlying software components that are required for mutation identification. Therefore, we have benchmarked a number of commonly used tools for sequence alignment and variant calling, in all pair-wise combinations, against both simulated and actual datasets. We compared the accuracy of those pipelines for mutation identification in C. elegans, and found that the combination of BBMap for alignment plus FreeBayes for variant calling offers the most robust performance.
url http://europepmc.org/articles/PMC5363872?pdf=render