6mAPred-MSFF: A Deep Learning Model for Predicting DNA N6-Methyladenine Sites across Species Based on a Multi-Scale Feature Fusion Mechanism
DNA methylation is one of the most extensive epigenetic modifications. DNA N6-methyladenine (6mA) plays a key role in many biological regulation processes. Accurate and reliable genome-wide identification of 6mA sites is crucial for systematically understanding its biological functions. Some machine...
Main Authors: | Rao Zeng, Minghong Liao |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-08-01 |
Series: | Applied Sciences |
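Subjects: | DNA N6-methyladenine; deep learning; site prediction; depthwise separable convolution; inverted residual structure; attention mechanism |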
Online Access: | https://www.mdpi.com/2076-3417/11/16/7731 |
id |
doaj-faf799d689054d73b184bc57185936f9 |
---|---|
record_format |
Article |
spelling |
doaj-faf799d689054d73b184bc57185936f9; 2021-08-26T13:31:10Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2021-08-01; 11; 7731; 7731; 10.3390/app11167731; 6mAPred-MSFF: A Deep Learning Model for Predicting DNA N6-Methyladenine Sites across Species Based on a Multi-Scale Feature Fusion Mechanism; Rao Zeng (0); Minghong Liao (1); (0) Department of Software Engineering, School of Informatics, Xiamen University, Xiamen 361005, China; (1) Department of Software Engineering, School of Informatics, Xiamen University, Xiamen 361005, China; https://www.mdpi.com/2076-3417/11/16/7731; DNA N6-methyladenine; deep learning; site prediction; depthwise separable convolution; inverted residual structure; attention mechanism |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Rao Zeng, Minghong Liao |
title |
6mAPred-MSFF: A Deep Learning Model for Predicting DNA N6-Methyladenine Sites across Species Based on a Multi-Scale Feature Fusion Mechanism |
publisher |
MDPI AG |
series |
Applied Sciences |
issn |
2076-3417 |
publishDate |
2021-08-01 |
description |
DNA methylation is one of the most extensive epigenetic modifications. DNA N6-methyladenine (6mA) plays a key role in many biological regulation processes. Accurate and reliable genome-wide identification of 6mA sites is crucial for systematically understanding its biological functions. Some machine learning tools can identify 6mA sites, but their limited prediction accuracy and lack of robustness restrict their usability in epigenetic studies, which points to a clear need for new computational methods. In this paper, we develop a novel computational predictor, 6mAPred-MSFF, a deep learning framework based on a multi-scale feature fusion mechanism that identifies 6mA sites across different species. In the predictor, we integrate the inverted residual block and a multi-scale attention mechanism to build lightweight, deep neural networks. Compared with existing predictors based on traditional machine learning, our deep learning framework needs no prior knowledge of 6mA or manually crafted sequence features, and it captures the characteristics of 6mA sites more fully. In benchmarking comparisons, our deep learning method outperforms state-of-the-art methods under 5-fold cross-validation on seven datasets from six species, demonstrating that the proposed 6mAPred-MSFF is more effective and generic. Specifically, 6mAPred-MSFF achieves a sensitivity of 97.88% and a specificity of 94.64% under 5-fold cross-validation on the 6mA-rice-Lv dataset. Our model trained on the rice data predicts well the 6mA sites of five other species (Arabidopsis thaliana, Fragaria vesca, Rosa chinensis, Homo sapiens, and Drosophila melanogaster), with prediction accuracies of 98.51%, 93.02%, and 91.53%, respectively. Moreover, through experimental comparison, we explored how different encoding schemes and feature descriptors affect the performance of the proposed model. |
topic |
DNA N6-methyladenine; deep learning; site prediction; depthwise separable convolution; inverted residual structure; attention mechanism |
url |
https://www.mdpi.com/2076-3417/11/16/7731 |
_version_ |
1721194982696550400 |
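
The description above reports sensitivity and specificity for the 6mA-rice-Lv dataset. As a reading aid, the following minimal Python sketch shows how these two metrics are conventionally computed from binary labels (1 = 6mA site, 0 = non-6mA site). The function name `sensitivity_specificity` and the toy labels are hypothetical illustrations; they are not taken from the article and do not reproduce its data or evaluation code.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels in {0, 1}.

    sensitivity = TP / (TP + FN): fraction of true sites that were found.
    specificity = TN / (TN + FP): fraction of non-sites correctly rejected.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy labels (not data from the article).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
sn, sp = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sn:.2f}, specificity={sp:.2f}")
```

Under 5-fold cross-validation, a function like this would typically be applied to the held-out fold in each of the five rounds and the results averaged; whether the article averages per fold or pools predictions is not stated in this record.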