Epigenetic Marks and Variation of Sequence-Based Information Along Genomic Regions Are Predictive of Recombination Hot/Cold Spots in Saccharomyces cerevisiae

Characterization and identification of recombination hotspots provide important insights into the mechanism of recombination and genome evolution. In contrast with existing sequence-based models for predicting recombination hotspots which were defined in an ORF-based manner, here, we first defined re...

Bibliographic Details
Main Authors: Guoqing Liu, Shuangjian Song, Qiguo Zhang, Biyu Dong, Yu Sun, Guojun Liu, Xiujuan Zhao
Format: Article
Language: English
Published: Frontiers Media S.A. 2021-06-01
Series: Frontiers in Genetics
Subjects: recombination hotspots; DNA physical property; classifier; epigenetic mark; optimal feature set
Online Access: https://www.frontiersin.org/articles/10.3389/fgene.2021.705038/full
id doaj-5f33f0d066084f90bdfce71cfa94ce20
record_format Article
spelling doaj-5f33f0d066084f90bdfce71cfa94ce20
2021-06-29T15:09:02Z
eng
Frontiers Media S.A.
Frontiers in Genetics
1664-8021
2021-06-01
12
10.3389/fgene.2021.705038
705038
Epigenetic Marks and Variation of Sequence-Based Information Along Genomic Regions Are Predictive of Recombination Hot/Cold Spots in Saccharomyces cerevisiae
Guoqing Liu (School of Life Sciences and Technology, Inner Mongolia University of Science and Technology, Baotou, China; Inner Mongolia Key Laboratory of Functional Genomics and Bioinformatics, Inner Mongolia University of Science and Technology, Baotou, China)
Shuangjian Song (School of Life Sciences and Technology, Inner Mongolia University of Science and Technology, Baotou, China)
Qiguo Zhang (School of Life Sciences and Technology, Inner Mongolia University of Science and Technology, Baotou, China)
Biyu Dong (School of Life Sciences and Technology, Inner Mongolia University of Science and Technology, Baotou, China)
Yu Sun (School of Life Sciences, Inner Mongolia University, Hohhot, China)
Guojun Liu (School of Life Sciences and Technology, Inner Mongolia University of Science and Technology, Baotou, China; Inner Mongolia Key Laboratory of Functional Genomics and Bioinformatics, Inner Mongolia University of Science and Technology, Baotou, China)
Xiujuan Zhao (School of Life Sciences and Technology, Inner Mongolia University of Science and Technology, Baotou, China; Inner Mongolia Key Laboratory of Functional Genomics and Bioinformatics, Inner Mongolia University of Science and Technology, Baotou, China)
Characterization and identification of recombination hotspots provide important insights into the mechanism of recombination and genome evolution. In contrast with existing sequence-based models for predicting recombination hotspots which were defined in an ORF-based manner, here we first defined recombination hot/cold spots based on public high-resolution Spo11-oligo-seq data, then characterized them in terms of DNA sequence and epigenetic marks, and finally presented classifiers to identify hotspots. We found that, in addition to some previously discovered DNA-based features such as GC-skew, recombination hotspots in yeast can also be characterized by some remarkable features associated with DNA physical properties and shape. More importantly, by using DNA-based features and several epigenetic marks, we built several classifiers to discriminate hotspots from coldspots, and found that the SVM classifier performs best, with an accuracy of ∼92%, which is also the highest among the models compared. Feature importance analysis combined with prediction results shows that epigenetic marks and the variation of sequence-based features along the hotspots contribute dominantly to hotspot identification. By using an incremental feature selection method, an optimal feature subset consisting of far fewer features was obtained without sacrificing prediction accuracy.
https://www.frontiersin.org/articles/10.3389/fgene.2021.705038/full
recombination hotspots
DNA physical property
classifier
epigenetic mark
optimal feature set
collection DOAJ
language English
format Article
sources DOAJ
author Guoqing Liu
Shuangjian Song
Qiguo Zhang
Biyu Dong
Yu Sun
Guojun Liu
Xiujuan Zhao
author_sort Guoqing Liu
title Epigenetic Marks and Variation of Sequence-Based Information Along Genomic Regions Are Predictive of Recombination Hot/Cold Spots in Saccharomyces cerevisiae
publisher Frontiers Media S.A.
series Frontiers in Genetics
issn 1664-8021
publishDate 2021-06-01
description Characterization and identification of recombination hotspots provide important insights into the mechanism of recombination and genome evolution. In contrast with existing sequence-based models for predicting recombination hotspots which were defined in an ORF-based manner, here we first defined recombination hot/cold spots based on public high-resolution Spo11-oligo-seq data, then characterized them in terms of DNA sequence and epigenetic marks, and finally presented classifiers to identify hotspots. We found that, in addition to some previously discovered DNA-based features such as GC-skew, recombination hotspots in yeast can also be characterized by some remarkable features associated with DNA physical properties and shape. More importantly, by using DNA-based features and several epigenetic marks, we built several classifiers to discriminate hotspots from coldspots, and found that the SVM classifier performs best, with an accuracy of ∼92%, which is also the highest among the models compared. Feature importance analysis combined with prediction results shows that epigenetic marks and the variation of sequence-based features along the hotspots contribute dominantly to hotspot identification. By using an incremental feature selection method, an optimal feature subset consisting of far fewer features was obtained without sacrificing prediction accuracy.
topic recombination hotspots
DNA physical property
classifier
epigenetic mark
optimal feature set
url https://www.frontiersin.org/articles/10.3389/fgene.2021.705038/full