Cyberinfrastructure and resources to enable an integrative approach to studying forest trees

Abstract Sequencing technologies and bioinformatic approaches are now available to resolve the challenges associated with complex and heterozygous genomes. Increased access to less expensive and more effective instrumentation will contribute to a wealth of high‐quality plant genomes in the next few...

Bibliographic Details
Main Authors: Jill L. Wegrzyn, Taylor Falk, Emily Grau, Sean Buehler, Risharde Ramnath, Nic Herndon
Format: Article
Language: English
Published: Wiley 2020-01-01
Series: Evolutionary Applications
Subjects: cyberinfrastructure; FAIR; phenomics; plant ontologies; population genetics; tree databases
Online Access: https://doi.org/10.1111/eva.12860
id doaj-3f5b7fa1bb174410a6f8268ce9315a60
record_format Article
spelling doaj-3f5b7fa1bb174410a6f8268ce9315a60
2020-11-25T02:56:31Z
eng
Wiley
Evolutionary Applications
1752-4571
2020-01-01
Volume 13, Issue 1, Pages 228-241
10.1111/eva.12860
Cyberinfrastructure and resources to enable an integrative approach to studying forest trees
Jill L. Wegrzyn, Taylor Falk, Emily Grau, Sean Buehler, Risharde Ramnath, Nic Herndon
Affiliation (all authors): Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut
collection DOAJ
language English
format Article
sources DOAJ
author Jill L. Wegrzyn
Taylor Falk
Emily Grau
Sean Buehler
Risharde Ramnath
Nic Herndon
title Cyberinfrastructure and resources to enable an integrative approach to studying forest trees
publisher Wiley
series Evolutionary Applications
issn 1752-4571
publishDate 2020-01-01
description Abstract Sequencing technologies and bioinformatic approaches are now available to resolve the challenges associated with complex and heterozygous genomes. Increased access to less expensive and more effective instrumentation will contribute to a wealth of high‐quality plant genomes in the next few years. In the meantime, more than 370 tree species are associated with public projects in primary repositories that are interrogating expression profiles, identifying variants, or analyzing targeted capture without a high‐quality reference genome. Genomic data from these projects generates sequences that represent intermediate assemblies for transcriptomes and genomes. These data contribute to forest tree biology, but the associated sequence remains trapped in supplemental files that are poorly integrated in plant community databases and comparative genomic platforms. Successful implementation of life science cyberinfrastructure is improving data standards, ontologies, analytic workflows, and integrated database platforms for both model and non‐model plant species. Unique to forest trees with large populations that are long‐lived, outcrossing, and genetically diverse, the phenotypic and environmental metrics associated with georeferenced populations are just as important as the genomic data sampled for each individual. To address questions related to forest health and productivity, cyberinfrastructure must keep pace with the magnitude of genomic and phenomic sampling of larger populations. This review examines the current landscape of cyberinfrastructure, with an emphasis on best practices and resources to align community data with the Findable, Accessible, Interoperable, and Reusable (FAIR) guidelines.
topic cyberinfrastructure
FAIR
phenomics
plant ontologies
population genetics
tree databases
url https://doi.org/10.1111/eva.12860