Large-Scale Discovery of Gene-Enriched SNPs
Whole-genome association studies of complex traits in higher eukaryotes require a high density of single nucleotide polymorphism (SNP) markers at genome-wide coverage. To design high-throughput, multiplexed SNP genotyping assays, researchers must first discover large numbers of SNPs by extensively r...
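Main Authors: | Michael A. Gore, Mark H. Wright, Elhan S. Ersoz, Pascal Bouffard, Edward S. Szekeres, Thomas P. Jarvie, Bonnie L. Hurwitz, Apurva Narechania, Timothy T. Harkins, George S. Grills, Doreen H. Ware, Edward S. Buckler |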
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2009-07-01 |
Series: | The Plant Genome |
Online Access: | https://dl.sciencesocieties.org/publications/tpg/articles/2/2/121 |
id |
doaj-e55e47fbcf054076a09a586d81f073f3 |
---|---|
record_format |
Article |
spelling |
doaj-e55e47fbcf054076a09a586d81f073f3; 2020-11-25T03:48:04Z; eng; Wiley; The Plant Genome; 1940-3372; 2009-07-01; vol. 2, no. 2, pp. 121–133; 10.3835/plantgenome2009.01.0002; Large-Scale Discovery of Gene-Enriched SNPs; Michael A. Gore, Mark H. Wright, Elhan S. Ersoz, Pascal Bouffard, Edward S. Szekeres, Thomas P. Jarvie, Bonnie L. Hurwitz, Apurva Narechania, Timothy T. Harkins, George S. Grills, Doreen H. Ware, Edward S. Buckler; https://dl.sciencesocieties.org/publications/tpg/articles/2/2/121 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Michael A. Gore, Mark H. Wright, Elhan S. Ersoz, Pascal Bouffard, Edward S. Szekeres, Thomas P. Jarvie, Bonnie L. Hurwitz, Apurva Narechania, Timothy T. Harkins, George S. Grills, Doreen H. Ware, Edward S. Buckler |
spellingShingle |
Michael A. Gore, Mark H. Wright, Elhan S. Ersoz, Pascal Bouffard, Edward S. Szekeres, Thomas P. Jarvie, Bonnie L. Hurwitz, Apurva Narechania, Timothy T. Harkins, George S. Grills, Doreen H. Ware, Edward S. Buckler; Large-Scale Discovery of Gene-Enriched SNPs; The Plant Genome |
author_facet |
Michael A. Gore, Mark H. Wright, Elhan S. Ersoz, Pascal Bouffard, Edward S. Szekeres, Thomas P. Jarvie, Bonnie L. Hurwitz, Apurva Narechania, Timothy T. Harkins, George S. Grills, Doreen H. Ware, Edward S. Buckler |
author_sort |
Michael A. Gore |
title |
Large-Scale Discovery of Gene-Enriched SNPs |
publisher |
Wiley |
series |
The Plant Genome |
issn |
1940-3372 |
publishDate |
2009-07-01 |
description |
Whole-genome association studies of complex traits in higher eukaryotes require a high density of single nucleotide polymorphism (SNP) markers at genome-wide coverage. To design high-throughput, multiplexed SNP genotyping assays, researchers must first discover large numbers of SNPs by extensively resequencing multiple individuals or lines. For SNP discovery approaches using short read-lengths that next-generation DNA sequencing technologies offer, the highly repetitive and duplicated nature of large plant genomes presents additional challenges. Here, we describe a genomic library construction procedure that facilitates pyrosequencing of genic and low-copy regions in plant genomes, and a customized computational pipeline to analyze and assemble short reads (100–200 bp), identify allelic reference sequence comparisons, and call SNPs with a high degree of accuracy. With maize (Zea mays L.) as the test organism in a pilot experiment, the implementation of these methods resulted in the identification of 126,683 putative SNPs between two maize inbred lines at an estimated false discovery rate (FDR) of 15.1%. We estimated rates of false SNP discovery using an internal control, and we validated these FDR rates with an external SNP dataset that was generated using locus-specific PCR amplification and Sanger sequencing. These results show that this approach has wide applicability for efficiently and accurately detecting gene-enriched SNPs in large, complex plant genomes. |
url |
https://dl.sciencesocieties.org/publications/tpg/articles/2/2/121 |
_version_ |
1724500381182984192 |
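
The description above reports 126,683 putative SNPs at an estimated false discovery rate (FDR) of 15.1%. As a rough reading aid, the sketch below turns those two figures into expected counts of true and false SNP calls. It assumes the usual definition of FDR as the expected fraction of calls that are false. The function name `expected_snp_counts` is hypothetical and is not part of the authors' pipeline.

```python
def expected_snp_counts(putative_snps: int, fdr: float) -> tuple[int, int]:
    """Return (expected_true, expected_false) SNP calls.

    Assumes FDR is the expected fraction of putative SNPs that are
    false positives. This is an illustrative calculation only, not
    the method used in the article.
    """
    if putative_snps < 0:
        raise ValueError("putative_snps must be non-negative")
    if not 0.0 <= fdr <= 1.0:
        raise ValueError("fdr must be between 0 and 1")
    # Expected false calls, rounded to the nearest whole SNP.
    expected_false = round(putative_snps * fdr)
    # Whatever remains is the expected number of true SNPs.
    expected_true = putative_snps - expected_false
    return expected_true, expected_false


if __name__ == "__main__":
    # Figures taken from the abstract: 126,683 putative SNPs, FDR 15.1%.
    true_calls, false_calls = expected_snp_counts(126_683, 0.151)
    print(f"expected true SNPs:  {true_calls}")
    print(f"expected false SNPs: {false_calls}")
```

Under these assumptions, roughly 107,500 of the putative SNPs would be expected to be real polymorphisms, with about 19,100 expected false calls.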