Comprehensive genomic analysis reveals dynamic evolution of endogenous retroviruses that code for retroviral-like protein domains

Abstract. Background: Endogenous retroviruses (ERVs) are remnants of ancient retroviral infections of mammalian germline cells. A large proportion of ERVs lose their open reading frames (ORFs), while others retain them and become exapted by the host species. However, it remains unclear what proportion...

Bibliographic Details
Main Authors: Mahoko Takahashi Ueda, Kirill Kryukov, Satomi Mitsuhashi, Hiroaki Mitsuhashi, Tadashi Imanishi, So Nakagawa
Format: Article
Language: English
Published: BMC 2020-09-01
Series: Mobile DNA
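Subjects: Endogenous retrovirus; Retroviral-like protein domain; Open reading frame; Evolution; Divergence pattern; Co-option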
Online Access: http://link.springer.com/article/10.1186/s13100-020-00224-w
id doaj-8d8e6b315fe24d90bbb4727e6a1424de
record_format Article
spelling doaj-8d8e6b315fe24d90bbb4727e6a1424de
2020-11-25T01:29:01Z
eng
BMC
Mobile DNA
1759-8753
2020-09-01
11
1
1
17
10.1186/s13100-020-00224-w
Comprehensive genomic analysis reveals dynamic evolution of endogenous retroviruses that code for retroviral-like protein domains
Mahoko Takahashi Ueda (Department of Molecular Life Science, Tokai University School of Medicine)
Kirill Kryukov (Department of Molecular Life Science, Tokai University School of Medicine)
Satomi Mitsuhashi (Department of Human Genetics, Yokohama City University Graduate School of Medicine)
Hiroaki Mitsuhashi (Micro/Nano Technology Center, Tokai University)
Tadashi Imanishi (Department of Molecular Life Science, Tokai University School of Medicine)
So Nakagawa (Department of Molecular Life Science, Tokai University School of Medicine)
Abstract
Background: Endogenous retroviruses (ERVs) are remnants of ancient retroviral infections of mammalian germline cells. A large proportion of ERVs lose their open reading frames (ORFs), while others retain them and become exapted by the host species. However, it remains unclear what proportion of ERVs possess ORFs (ERV-ORFs), become transcribed, and serve as candidates for co-opted genes.
Results: We investigated characteristics of 176,401 ERV-ORFs containing retroviral-like protein domains (gag, pro, pol, and env) in 19 mammalian genomes. The fractions of ERVs possessing ORFs were overall small (~0.15%) although they varied depending on domain types as well as species. The observed divergence of ERV-ORF from their consensus sequences showed bimodal distributions, suggesting that a large proportion of ERV-ORFs either recently, or anciently, inserted themselves into mammalian genomes. Alternatively, very few ERVs lacking ORFs were found to exhibit similar divergence patterns. To identify candidates for ERV-derived genes, we estimated the ratio of non-synonymous to synonymous substitution rates (dN/dS) for ERV-ORFs in human and non-human mammalian pairs, and found that approximately 42% of the ERV-ORFs showed dN/dS < 1. Further, using functional genomics data including transcriptome sequencing, we determined that approximately 9.7% of these selected ERV-ORFs exhibited transcriptional potential.
Conclusions: These results suggest that purifying selection operates on a certain portion of ERV-ORFs, some of which may correspond to uncharacterized functional genes hidden within mammalian genomes. Together, our analyses suggest that more ERV-ORFs may be co-opted in a host-species specific manner than we currently know, which are likely to have contributed to mammalian evolution and diversification.
http://link.springer.com/article/10.1186/s13100-020-00224-w
Endogenous retrovirus
Retroviral-like protein domain
Open reading frame
Evolution
Divergence pattern
Co-option
collection DOAJ
language English
format Article
sources DOAJ
author Mahoko Takahashi Ueda
Kirill Kryukov
Satomi Mitsuhashi
Hiroaki Mitsuhashi
Tadashi Imanishi
So Nakagawa
title Comprehensive genomic analysis reveals dynamic evolution of endogenous retroviruses that code for retroviral-like protein domains
publisher BMC
series Mobile DNA
issn 1759-8753
publishDate 2020-09-01
topic Endogenous retrovirus
Retroviral-like protein domain
Open reading frame
Evolution
Divergence pattern
Co-option
url http://link.springer.com/article/10.1186/s13100-020-00224-w