Epigenetic Marks and Variation of Sequence-Based Information Along Genomic Regions Are Predictive of Recombination Hot/Cold Spots in Saccharomyces cerevisiae
Characterization and identification of recombination hotspots provide important insights into the mechanism of recombination and genome evolution. In contrast with existing sequence-based models for predicting recombination hotspots, which were defined in an ORF-based manner, here we first defined re...
Main Authors: | Guoqing Liu, Shuangjian Song, Qiguo Zhang, Biyu Dong, Yu Sun, Guojun Liu, Xiujuan Zhao |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2021-06-01 |
Series: | Frontiers in Genetics |
Subjects: | recombination hotspots, DNA physical property, classifier, epigenetic mark, optimal feature set |
Online Access: | https://www.frontiersin.org/articles/10.3389/fgene.2021.705038/full |
id |
doaj-5f33f0d066084f90bdfce71cfa94ce20 |
---|---|
record_format |
Article |
spelling |
Record: doaj-5f33f0d066084f90bdfce71cfa94ce20; 2021-06-29T15:09:02Z; eng
Source: Frontiers Media S.A.; Frontiers in Genetics; ISSN 1664-8021; 2021-06-01; vol. 12; DOI 10.3389/fgene.2021.705038; article 705038
Title: Epigenetic Marks and Variation of Sequence-Based Information Along Genomic Regions Are Predictive of Recombination Hot/Cold Spots in Saccharomyces cerevisiae
Authors and affiliations:
Guoqing Liu: School of Life Sciences and Technology, Inner Mongolia University of Science and Technology, Baotou, China; Inner Mongolia Key Laboratory of Functional Genomics and Bioinformatics, Inner Mongolia University of Science and Technology, Baotou, China
Shuangjian Song: School of Life Sciences and Technology, Inner Mongolia University of Science and Technology, Baotou, China
Qiguo Zhang: School of Life Sciences and Technology, Inner Mongolia University of Science and Technology, Baotou, China
Biyu Dong: School of Life Sciences and Technology, Inner Mongolia University of Science and Technology, Baotou, China
Yu Sun: School of Life Sciences, Inner Mongolia University, Hohhot, China
Guojun Liu: School of Life Sciences and Technology, Inner Mongolia University of Science and Technology, Baotou, China; Inner Mongolia Key Laboratory of Functional Genomics and Bioinformatics, Inner Mongolia University of Science and Technology, Baotou, China
Xiujuan Zhao: School of Life Sciences and Technology, Inner Mongolia University of Science and Technology, Baotou, China; Inner Mongolia Key Laboratory of Functional Genomics and Bioinformatics, Inner Mongolia University of Science and Technology, Baotou, China
URL: https://www.frontiersin.org/articles/10.3389/fgene.2021.705038/full
Keywords: recombination hotspots; DNA physical property; classifier; epigenetic mark; optimal feature set |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Guoqing Liu, Shuangjian Song, Qiguo Zhang, Biyu Dong, Yu Sun, Guojun Liu, Xiujuan Zhao |
author_sort |
Guoqing Liu |
title |
Epigenetic Marks and Variation of Sequence-Based Information Along Genomic Regions Are Predictive of Recombination Hot/Cold Spots in Saccharomyces cerevisiae |
publisher |
Frontiers Media S.A. |
series |
Frontiers in Genetics |
issn |
1664-8021 |
publishDate |
2021-06-01 |
description |
Characterization and identification of recombination hotspots provide important insights into the mechanism of recombination and genome evolution. In contrast with existing sequence-based models for predicting recombination hotspots, which were defined in an ORF-based manner, here we first defined recombination hot/cold spots based on public high-resolution Spo11-oligo-seq data, then characterized them in terms of DNA sequence and epigenetic marks, and finally presented classifiers to identify hotspots. We found that, in addition to some previously discovered DNA-based features such as GC-skew, recombination hotspots in yeast can also be characterized by some remarkable features associated with DNA physical properties and shape. More importantly, by using DNA-based features and several epigenetic marks, we built several classifiers to discriminate hotspots from coldspots, and found that the SVM classifier performs best, with an accuracy of ∼92%, which is also the highest among the models compared. Feature importance analysis combined with prediction results shows that epigenetic marks and variation of sequence-based features along the hotspots contribute dominantly to hotspot identification. By using an incremental feature selection method, an optimal feature subset consisting of far fewer features was obtained without sacrificing prediction accuracy. |
topic |
recombination hotspots; DNA physical property; classifier; epigenetic mark; optimal feature set |
url |
https://www.frontiersin.org/articles/10.3389/fgene.2021.705038/full |
_version_ |
1721354715508244480 |
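
The description in this record names GC-skew as a DNA-based feature and says that variation of sequence-based features along a region is informative. As a minimal sketch only, and not the authors' pipeline, the following Python shows one way such a feature could be computed: GC-skew, (G - C) / (G + C), in consecutive windows, summarized by its standard deviation. The function names, the window and step sizes, and the example sequence are all assumptions made for illustration.

```python
def gc_skew(seq):
    """Return (G - C) / (G + C) for a DNA string; 0.0 if it has no G or C."""
    seq = seq.upper()
    g = seq.count("G")
    c = seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed_gc_skew(seq, window=4, step=4):
    """GC-skew of successive windows along the sequence (sizes are illustrative)."""
    return [
        gc_skew(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


def variation(values):
    """Population standard deviation: one simple summary of variation along a region."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5


# Hypothetical toy sequence, not taken from the yeast genome.
profile = windowed_gc_skew("GGGCATATCCCGGCGC", window=4, step=4)
print(profile)             # per-window GC-skew values
print(variation(profile))  # spread of GC-skew along the region
```

In a setting like the one described, a per-region summary of this kind would be one column in the feature matrix passed to a classifier; the actual windowing and feature definitions used in the article are not given in this record.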