Improved Identification of Small Open Reading Frames Encoded Peptides by Top-Down Proteomic Approaches and De Novo Sequencing
Small open reading frames (sORFs) have translational potential to produce peptides that play essential roles in various biological processes. Nevertheless, many sORF-encoded peptides (SEPs) are still on the prediction level. Here, we construct a strategy to analyze SEPs by combining top-down and de...
Main Authors: | Bing Wang, Zhiwei Wang, Ni Pan, Jiangmei Huang, Cuihong Wan |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-05-01 |
Series: | International Journal of Molecular Sciences |
Subjects: | sORF-encoded peptides; de novo sequencing; top-down; non-ATG start codon; sequence coverage |
Online Access: | https://www.mdpi.com/1422-0067/22/11/5476 |
id | doaj-8163bc5fea274ddf8073d304f2fe2320 |
---|---|
record_format | Article |
spelling | doaj-8163bc5fea274ddf8073d304f2fe2320; 2021-06-01T00:49:16Z; eng; MDPI AG; International Journal of Molecular Sciences; ISSN 1661-6596, 1422-0067; 2021-05-01; vol. 22, pp. 5476-5476; doi:10.3390/ijms22115476; Improved Identification of Small Open Reading Frames Encoded Peptides by Top-Down Proteomic Approaches and De Novo Sequencing; Bing Wang, Zhiwei Wang, Ni Pan, Jiangmei Huang, Cuihong Wan (all five authors: Hubei Key Lab of Genetic Regulation and Integrative Biology, School of Life Sciences, Central China Normal University, No. 152 Luoyu Road, Wuhan 430079, China); https://www.mdpi.com/1422-0067/22/11/5476; sORF-encoded peptides; de novo sequencing; top-down; non-ATG start codon; sequence coverage |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | Bing Wang, Zhiwei Wang, Ni Pan, Jiangmei Huang, Cuihong Wan |
author_sort | Bing Wang |
title | Improved Identification of Small Open Reading Frames Encoded Peptides by Top-Down Proteomic Approaches and De Novo Sequencing |
publisher | MDPI AG |
series | International Journal of Molecular Sciences |
issn | 1661-6596, 1422-0067 |
publishDate | 2021-05-01 |
description | Small open reading frames (sORFs) have translational potential to produce peptides that play essential roles in various biological processes. Nevertheless, many sORF-encoded peptides (SEPs) are still on the prediction level. Here, we construct a strategy to analyze SEPs by combining top-down and de novo sequencing to improve SEP identification and sequence coverage. With de novo sequencing, we identified 1682 peptides mapping to 2544 human sORFs, which were all first characterized in this work. Two-thirds of these new sORFs have reading frame shifts and use a non-ATG start codon. The top-down approach identified 241 human SEPs, with high sequence coverage. The average length of the peptides from the bottom-up database search was 19 amino acids (AA); from de novo sequencing, it was 9 AA; and from the top-down approach, it was 25 AA. The longer peptide positively boosts the sequence coverage, more efficiently distinguishing SEPs from the known gene coding sequence. Top-down has the advantage of identifying peptides with sequential K/R or high K/R content, which is unfavorable in the bottom-up approach. Our method can explore new coding sORFs and obtain highly accurate sequences of their SEPs, which can also benefit future function research. |
topic | sORF-encoded peptides; de novo sequencing; top-down; non-ATG start codon; sequence coverage |
url | https://www.mdpi.com/1422-0067/22/11/5476 |
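
The abstract's remark that sequential K/R or high K/R content is unfavorable in the bottom-up approach can be illustrated with a small in-silico digest. The sketch below is not the authors' pipeline. It assumes a simplified trypsin rule (cleave after K or R unless the next residue is P), a hypothetical minimum detectable peptide length of 6 residues, and a made-up example sequence. The function names `tryptic_digest` and `sequence_coverage` are illustrative only.

```python
def tryptic_digest(sequence: str) -> list[str]:
    """Split a protein sequence after K or R, except when followed by P.

    This is a simplified trypsin rule with no missed cleavages.
    """
    peptides = []
    start = 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Trailing fragment that does not end in K or R
        peptides.append(sequence[start:])
    return peptides


def sequence_coverage(sequence: str, peptides: list[str]) -> float:
    """Fraction of residues covered by at least one identified peptide."""
    if not sequence:
        return 0.0
    covered = [False] * len(sequence)
    for pep in peptides:
        pos = sequence.find(pep)
        while pos != -1:
            for i in range(pos, pos + len(pep)):
                covered[i] = True
            pos = sequence.find(pep, pos + 1)
    return sum(covered) / len(sequence)


# Hypothetical K/R-rich sequence, invented for illustration only
HYPOTHETICAL_SEP = "MKRKRLAGSTPEKKRR"
MIN_LENGTH = 6  # assumed lower bound for a confidently matched peptide

fragments = tryptic_digest(HYPOTHETICAL_SEP)
detectable = [p for p in fragments if len(p) >= MIN_LENGTH]

bottom_up = sequence_coverage(HYPOTHETICAL_SEP, detectable)
top_down = sequence_coverage(HYPOTHETICAL_SEP, [HYPOTHETICAL_SEP])

print(fragments)
print(f"bottom-up coverage: {bottom_up:.2f}, top-down coverage: {top_down:.2f}")
```

Under these assumptions, most tryptic fragments of the K/R-rich sequence are one or two residues long and fall below the length cutoff, so only part of the sequence is covered, whereas measuring the intact peptide covers it fully.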