Finding Colon Cancer- and Colorectal Cancer-Related Microbes Based on Microbe–Disease Association Prediction

Microbes are closely associated with the formation and development of diseases. The identification of the potential associations between microbes and diseases can boost the understanding of various complex diseases. Wet experiments applied to microbe–disease association (MDA) identification are cost...

Bibliographic Details
Main Authors: Yu Chen, Hongjian Sun, Mengzhe Sun, Changguo Shi, Hongmei Sun, Xiaoli Shi, Binbin Ji, Jinpeng Cui
Format: Article
Language: English
Published: Frontiers Media S.A. 2021-03-01
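Series: Frontiers in Microbiology
Subjects: microbe–disease association; negative sample selection; linear neighborhood similarity; label propagation; information integration; colon cancer
Online Access: https://www.frontiersin.org/articles/10.3389/fmicb.2021.650056/full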
id doaj-330878f185114431b05907aa373272a3
record_format Article
spelling doaj-330878f185114431b05907aa373272a3
2021-03-16T05:45:36Z
eng
Frontiers Media S.A.
Frontiers in Microbiology
1664-302X
2021-03-01
12
10.3389/fmicb.2021.650056
650056
Yu Chen (The Cancer Hospital of Jia Mu Si, Jiamusi, China)
Hongjian Sun (Oncological Surgery, The Central Hospital of Jia Mu Si, Jiamusi, China)
Mengzhe Sun (Oncological Surgery, The Central Hospital of Jia Mu Si, Jiamusi, China)
Changguo Shi (Department of Thoracic Surgery, The Cancer Hospital of Jia Mu Si, Jiamusi, China)
Hongmei Sun (Medical Oncology, The Cancer Hospital of Jia Mu Si, Jiamusi, China)
Xiaoli Shi (Geneis Beijing Co., Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China)
Binbin Ji (Geneis Beijing Co., Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China)
Jinpeng Cui (Department of Laboratory Medicine, Yantaishan Hospital of Yantai City, Yantai, China)
collection DOAJ
language English
format Article
sources DOAJ
author Yu Chen
Hongjian Sun
Mengzhe Sun
Changguo Shi
Hongmei Sun
Xiaoli Shi
Binbin Ji
Jinpeng Cui
title Finding Colon Cancer- and Colorectal Cancer-Related Microbes Based on Microbe–Disease Association Prediction
publisher Frontiers Media S.A.
series Frontiers in Microbiology
issn 1664-302X
publishDate 2021-03-01
description Microbes are closely associated with the formation and development of diseases. Identifying potential associations between microbes and diseases can improve the understanding of various complex diseases. Wet-lab experiments for microbe–disease association (MDA) identification are costly and time-consuming. In this manuscript, we developed a novel computational model, NLLMDA, to find unobserved MDAs, especially for colon cancer and colorectal carcinoma. NLLMDA integrates negative MDA selection, linear neighborhood similarity, label propagation, information integration, and known biological data. First, the Gaussian association profile (GAP) similarity of microbes and the GAP similarity and symptom similarity of diseases were computed. Second, the linear neighborhood method was applied to these similarity matrices to obtain more stable performance. Third, negative MDA samples were selected, and the label propagation algorithm was used to score microbe–disease pairs. The final association probabilities were then computed with the information integration method. NLLMDA was compared with five other classical MDA methods and obtained the highest area under the curve (AUC) values, 0.9031 and 0.9335, on cross-validations of diseases and of microbe–disease pairs, respectively. The results suggest that NLLMDA is an effective prediction method. More importantly, we found that Acidobacteriaceae may have a close link with colon cancer and that Tannerella may be closely associated with colorectal carcinoma.
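The abstract names its building blocks without giving formulas, so the following is only a minimal sketch of two of them, not the authors' NLLMDA implementation. It assumes that GAP similarity follows the commonly used Gaussian interaction profile kernel (bandwidth set to the inverse mean squared profile norm) and that label propagation uses the standard update F = alpha * W * F + (1 - alpha) * Y. Plain row normalization stands in for the linear neighborhood step, and negative sample selection and information integration are omitted. The toy matrix, the function names, and alpha = 0.5 are all hypothetical.

```python
import numpy as np


def gap_similarity(profiles: np.ndarray) -> np.ndarray:
    """Gaussian kernel over the rows of a binary association matrix.

    Assumed form: exp(-gamma * ||a_i - a_j||^2), with gamma equal to the
    inverse of the mean squared row norm.
    """
    profiles = profiles.astype(float)
    sq_norms = (profiles ** 2).sum(axis=1)
    mean_norm = sq_norms.mean()
    gamma = 1.0 / mean_norm if mean_norm > 0 else 1.0
    # Pairwise squared Euclidean distances between association profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-gamma * sq_dists)


def row_normalize(similarity: np.ndarray) -> np.ndarray:
    """Turn a similarity matrix into row-stochastic propagation weights.

    Stand-in for the linear neighborhood step, which is NOT reproduced here.
    """
    return similarity / similarity.sum(axis=1, keepdims=True)


def label_propagation(weights, labels, alpha=0.5, n_iter=500, tol=1e-10):
    """Iterate F <- alpha * W @ F + (1 - alpha) * Y until the update is tiny."""
    labels = labels.astype(float)
    scores = labels.copy()
    for _ in range(n_iter):
        updated = alpha * weights @ scores + (1.0 - alpha) * labels
        converged = np.abs(updated - scores).max() < tol
        scores = updated
        if converged:
            break
    return scores


# Hypothetical toy data: 4 microbes (rows) x 3 diseases (columns).
associations = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
])

microbe_sim = gap_similarity(associations)
weights = row_normalize(microbe_sim)
scores = label_propagation(weights, associations, alpha=0.5)

print(np.round(scores, 3))
```

In this sketch, unobserved microbe–disease pairs receive nonzero scores when similar microbes carry known associations, which is the intuition behind ranking candidate microbes for a disease such as colon cancer.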
topic microbe–disease association
negative sample selection
linear neighborhood similarity
label propagation
information integration
colon cancer
url https://www.frontiersin.org/articles/10.3389/fmicb.2021.650056/full