Toward genomic selection in Pinus taeda: Integrating resources to support array design in a complex conifer genome

Bibliographic Details
Main Authors: Madison Caballero, Edwin Lauer, Jeremy Bennett, Sumaira Zaman, Susan McEvoy, Juan Acosta, Colin Jackson, Laura Townsend, Andrew Eckert, Ross W. Whetten, Carol Loopstra, Jason Holliday, Mihir Mandal, Jill L. Wegrzyn, Fikret Isik
Format: Article
Language: English
Published: Wiley 2021-06-01
Series: Applications in Plant Sciences
Online Access: https://doi.org/10.1002/aps3.11439
Description
Summary:
Premise: An informatics approach was used for the construction of an Axiom genotyping array from heterogeneous, high‐throughput sequence data to assess the complex genome of loblolly pine (Pinus taeda).
Methods: High‐throughput sequence data, sourced from exome capture and whole genome reduced‐representation approaches from 2698 trees across five sequence populations, were analyzed with the improved genome assembly and annotation for the loblolly pine. A variant detection, filtering, and probe design pipeline was developed to detect true variants across and within populations. From 8.27 million variants, a total of 642,275 were evaluated and 423,695 of those were screened across a range‐wide population.
Results: The final informatics and screening approach delivered an Axiom array representing 46,439 high‐confidence variants to the forest tree breeding and genetics community. Based on the annotated reference genome, 34% were located in or directly upstream or downstream of genic regions.
Discussion: The Pita50K array represents a genome‐wide resource developed from sequence data for an economically important conifer, loblolly pine. It uniquely integrates independent projects that assessed trees sampled across the native range. The challenges associated with the large and repetitive genome are addressed in the development of this resource.
ISSN: 2168-0450
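
The variant counts reported in the Summary can be read as a filtering funnel. The following is a minimal arithmetic sketch in Python, not part of the published pipeline. It uses only the counts stated in the Summary, treats "8.27 million" as the rounded value 8,270,000, and assumes that the reported 34% refers to the 46,439 variants on the final array. The names `STAGES` and `retention` are hypothetical and exist only for this illustration.

```python
# Counts as stated in the Summary. 8.27 million is a rounded figure,
# so every fraction derived from it is approximate.
STAGES = [
    ("variants detected", 8_270_000),
    ("variants evaluated", 642_275),
    ("variants screened range-wide", 423_695),
    ("variants on final array", 46_439),
]


def retention(stages):
    """Return (label, count, fraction kept relative to the previous stage)."""
    rows = []
    previous = None
    for label, count in stages:
        fraction = None if previous is None else count / previous
        rows.append((label, count, fraction))
        previous = count
    return rows


rows = retention(STAGES)
for label, count, fraction in rows:
    kept = "" if fraction is None else f" ({fraction:.2%} of previous stage)"
    print(f"{label}: {count:,}{kept}")

# Assumption: the 34% genic share applies to the final array.
# The percentage is itself rounded, so this is only an estimate.
genic_estimate = round(0.34 * STAGES[-1][1])
print(f"approx. variants in or near genic regions: {genic_estimate:,}")
```

Under these assumptions, roughly 8% of detected variants were evaluated, about two thirds of those were screened across the range-wide population, and about 11% of the screened variants reached the final array.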