Identification of 11 candidate structured noncoding RNA motifs in humans by comparative genomics
Abstract. Background: Only 1.5% of the human genome encodes proteins, while a large part of the remainder encodes noncoding RNAs (ncRNAs). Many ncRNAs form structures and perform important functions. Accurately identifying structured ncRNAs in the human genome and discovering their biological functi...
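Main Authors: | Lijuan Hou, Jin Xie, Yaoyao Wu, Jiaojiao Wang, Anqi Duan, Yaqi Ao, Xuejiao Liu, Xinmei Yu, Hui Yan, Jonathan Perreault, Sanshu Li |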
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2021-03-01 |
Series: | BMC Genomics |
Subjects: | Comparative genomics; Structured ncRNAs; Human genomes; Animal genomes; Pipeline |
Online Access: | https://doi.org/10.1186/s12864-021-07474-9 |
id |
doaj-7cf62bd1d4354d7498b0ba7a917d5c44 |
---|---|
record_format |
Article |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Lijuan Hou; Jin Xie; Yaoyao Wu; Jiaojiao Wang; Anqi Duan; Yaqi Ao; Xuejiao Liu; Xinmei Yu; Hui Yan; Jonathan Perreault; Sanshu Li |
spellingShingle |
Lijuan Hou; Jin Xie; Yaoyao Wu; Jiaojiao Wang; Anqi Duan; Yaqi Ao; Xuejiao Liu; Xinmei Yu; Hui Yan; Jonathan Perreault; Sanshu Li; Identification of 11 candidate structured noncoding RNA motifs in humans by comparative genomics; BMC Genomics; Comparative genomics; Structured ncRNAs; Human genomes; Animal genomes; Pipeline |
author_facet |
Lijuan Hou; Jin Xie; Yaoyao Wu; Jiaojiao Wang; Anqi Duan; Yaqi Ao; Xuejiao Liu; Xinmei Yu; Hui Yan; Jonathan Perreault; Sanshu Li |
author_sort |
Lijuan Hou |
title |
Identification of 11 candidate structured noncoding RNA motifs in humans by comparative genomics |
title_short |
Identification of 11 candidate structured noncoding RNA motifs in humans by comparative genomics |
title_full |
Identification of 11 candidate structured noncoding RNA motifs in humans by comparative genomics |
title_fullStr |
Identification of 11 candidate structured noncoding RNA motifs in humans by comparative genomics |
title_full_unstemmed |
Identification of 11 candidate structured noncoding RNA motifs in humans by comparative genomics |
title_sort |
identification of 11 candidate structured noncoding rna motifs in humans by comparative genomics |
publisher |
BMC |
series |
BMC Genomics |
issn |
1471-2164 |
publishDate |
2021-03-01 |
description |
Abstract. Background: Only 1.5% of the human genome encodes proteins, while a large part of the remainder encodes noncoding RNAs (ncRNAs). Many ncRNAs form structures and perform important functions. Accurately identifying structured ncRNAs in the human genome and discovering their biological functions remains a major challenge. Results: Here, we have established a pipeline (CM-line) with the following features for analyzing the large genomes of humans and other animals. First, we selected species with larger genetic distances to facilitate the discovery of covariations and compatible mutations. Second, we used CMfinder, which can generate useful alignments even with low sequence conservation. Third, we removed repetitive sequences and known structured ncRNAs to reduce the workload of CMfinder. Fourth, we used Infernal to find more representatives and refine the structure. We reported 11 classes of structured ncRNA candidates with significant covariations in humans. Functional analysis showed that these ncRNAs may have varied functions. Some may regulate circadian clock genes through poly(A) signals (PAS); some may regulate the elongation factor (EEF1A) and the T-cell receptor signaling pathway by cooperating with RNA-binding proteins. Conclusions: By searching for important features of RNA structure in large genomes, the CM-line has revealed the existence of a variety of novel structured ncRNAs. Functional analysis suggests that some newly discovered ncRNA motifs may have biological functions. The pipeline we have established for the discovery of structured ncRNAs and the identification of their functions can also be applied to analyze other large genomes. |
topic |
Comparative genomics; Structured ncRNAs; Human genomes; Animal genomes; Pipeline |
url |
https://doi.org/10.1186/s12864-021-07474-9 |
work_keys_str_mv |
AT lijuanhou identificationof11candidatestructurednoncodingrnamotifsinhumansbycomparativegenomics AT jinxie identificationof11candidatestructurednoncodingrnamotifsinhumansbycomparativegenomics AT yaoyaowu identificationof11candidatestructurednoncodingrnamotifsinhumansbycomparativegenomics AT jiaojiaowang identificationof11candidatestructurednoncodingrnamotifsinhumansbycomparativegenomics AT anqiduan identificationof11candidatestructurednoncodingrnamotifsinhumansbycomparativegenomics AT yaqiao identificationof11candidatestructurednoncodingrnamotifsinhumansbycomparativegenomics AT xuejiaoliu identificationof11candidatestructurednoncodingrnamotifsinhumansbycomparativegenomics AT xinmeiyu identificationof11candidatestructurednoncodingrnamotifsinhumansbycomparativegenomics AT huiyan identificationof11candidatestructurednoncodingrnamotifsinhumansbycomparativegenomics AT jonathanperreault identificationof11candidatestructurednoncodingrnamotifsinhumansbycomparativegenomics AT sanshuli identificationof11candidatestructurednoncodingrnamotifsinhumansbycomparativegenomics |
_version_ |
1724225081980223488 |
spelling |
doaj-7cf62bd1d4354d7498b0ba7a917d5c44; 2021-03-11T11:53:56Z; eng; BMC; BMC Genomics; 1471-2164; 2021-03-01; 22; 1; 1; 14; 10.1186/s12864-021-07474-9; Identification of 11 candidate structured noncoding RNA motifs in humans by comparative genomics; Lijuan Hou (0), Jin Xie (1), Yaoyao Wu (2), Jiaojiao Wang (3), Anqi Duan (4), Yaqi Ao (5), Xuejiao Liu (6), Xinmei Yu (7), Hui Yan (8), Jonathan Perreault (9), Sanshu Li (10); (0-8, 10) Medical School, Molecular Medicine Engineering and Research Center of Ministry of Education, Key Laboratory of Precision Medicine and Molecular Diagnosis of Fujian Universities, Institute of Genomics, School of Biomedical Sciences, Huaqiao University; (9) INRS - Institut Armand-Frappier; Abstract. Background: Only 1.5% of the human genome encodes proteins, while a large part of the remainder encodes noncoding RNAs (ncRNAs). Many ncRNAs form structures and perform important functions. Accurately identifying structured ncRNAs in the human genome and discovering their biological functions remains a major challenge. Results: Here, we have established a pipeline (CM-line) with the following features for analyzing the large genomes of humans and other animals. First, we selected species with larger genetic distances to facilitate the discovery of covariations and compatible mutations. Second, we used CMfinder, which can generate useful alignments even with low sequence conservation. Third, we removed repetitive sequences and known structured ncRNAs to reduce the workload of CMfinder. Fourth, we used Infernal to find more representatives and refine the structure. We reported 11 classes of structured ncRNA candidates with significant covariations in humans. Functional analysis showed that these ncRNAs may have varied functions. Some may regulate circadian clock genes through poly(A) signals (PAS); some may regulate the elongation factor (EEF1A) and the T-cell receptor signaling pathway by cooperating with RNA-binding proteins. Conclusions: By searching for important features of RNA structure in large genomes, the CM-line has revealed the existence of a variety of novel structured ncRNAs. Functional analysis suggests that some newly discovered ncRNA motifs may have biological functions. The pipeline we have established for the discovery of structured ncRNAs and the identification of their functions can also be applied to analyze other large genomes.; https://doi.org/10.1186/s12864-021-07474-9; Comparative genomics; Structured ncRNAs; Human genomes; Animal genomes; Pipeline |
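
The abstract in this record leans on two kinds of evidence for a conserved RNA structure: covariations and compatible mutations. As a rough illustration only, the sketch below counts both for one candidate base pair in a toy alignment. It is not the authors' CM-line code and does not reproduce how CMfinder or Infernal score alignments. The function name `classify_column_pair`, the example sequences, and the simple rule used here (both positions differ = covariation, one position differs = compatible mutation, pairing preserved in both cases) are assumptions made for this example.

```python
from itertools import combinations

# Watson-Crick pairs plus the G-U wobble pair (RNA alphabet).
PAIRING = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def classify_column_pair(alignment, i, j):
    """Compare every two sequences that form a base pair at columns i and j.

    alignment: list of equal-length aligned RNA strings (hypothetical toy data).
    Returns counts of sequence pairs showing a covariation (both positions
    differ), a compatible mutation (one position differs), or no change.
    Sequences that do not pair at (i, j) are ignored.
    """
    pairs = [(seq[i].upper(), seq[j].upper()) for seq in alignment]
    paired = [p for p in pairs if p in PAIRING]
    counts = {"covariation": 0, "compatible": 0, "identical": 0}
    for (a1, b1), (a2, b2) in combinations(paired, 2):
        changed = (a1 != a2) + (b1 != b2)
        if changed == 2:
            counts["covariation"] += 1  # e.g. G-C versus C-G
        elif changed == 1:
            counts["compatible"] += 1   # e.g. G-C versus G-U
        else:
            counts["identical"] += 1
    return counts


# Toy alignment: columns 0 and 4 are the candidate base pair.
toy_alignment = ["GAAAC", "CAAAG", "GAAAU", "AAAAA"]
print(classify_column_pair(toy_alignment, 0, 4))
# {'covariation': 2, 'compatible': 1, 'identical': 0}
```

This also hints at why the abstract describes selecting species with larger genetic distances: closely related sequences would mostly fall into the "identical" bucket and carry little evidence for or against a conserved pair.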