CODA-ML: context-specific biological knowledge representation for systemic physiology analysis


Bibliographic Details
Main Authors: Mijin Kwon, Soorin Yim, Gwangmin Kim, Saehwan Lee, Chungsun Jeong, Doheon Lee
Format: Article
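Language: English
Published: BMC 2019-05-01
Series: BMC Bioinformatics
Subjects: Biological knowledge; Essential biological information; Molecular specification; Biological context; Standard language
Online Access: http://link.springer.com/article/10.1186/s12859-019-2812-7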
id doaj-6fa0c755110a406cafebfbad9c799af3
record_format Article
spelling doaj-6fa0c755110a406cafebfbad9c799af3 2020-11-25T03:25:15Z eng BMC BMC Bioinformatics 1471-2105 2019-05-01 20 S10 4553 10.1186/s12859-019-2812-7 CODA-ML: context-specific biological knowledge representation for systemic physiology analysis Mijin Kwon, Soorin Yim, Gwangmin Kim, Saehwan Lee, Chungsun Jeong, Doheon Lee (all: Department of Bio and Brain Engineering, KAIST)
collection DOAJ
language English
format Article
sources DOAJ
author Mijin Kwon
Soorin Yim
Gwangmin Kim
Saehwan Lee
Chungsun Jeong
Doheon Lee
title CODA-ML: context-specific biological knowledge representation for systemic physiology analysis
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2019-05-01
description Abstract. Background: Computational analysis of complex diseases involving multiple organs requires the integration of multiple models into a unified model. Different models are often constructed in heterogeneous formats, so integrating them requires a standard language format that can effectively represent essential biological information. However, previously introduced formats have limitations that prevent them from adequately representing this information, particularly the specifications of bio-molecules and biological contexts. Results: We defined an XML-based markup language called context-oriented directed association markup language (CODA-ML), which better represents essential biological information. CODA-ML has two major strengths: designating molecular specifications and designating biological contexts. It covers heterogeneous entity types involved in biological events (e.g., gene/protein, compound, cellular function, disease). Molecular entity types can carry molecular specifications, which include detailed information about a molecule from isoforms to modifications, enabling high-resolution representation of molecules. In addition, it can distinguish biological events that vary across biological contexts such as cell types or disease conditions. In particular, it can represent inter-cellular as well as intra-cellular events. These two strengths can resolve contradictory associations when different models are integrated into one unified model, which improves the accuracy of the model. Conclusions: With CODA-ML, diverse models such as signaling pathways, metabolic pathways, and gene regulatory pathways can be represented in a unified language format. Because CODA-ML covers heterogeneous entity types, it enables detailed description of the mechanisms of diseases or drugs from multiple perspectives (e.g., molecule, function, or disease). CODA-ML is expected to help integrate different models into one systemic model efficiently and effectively. The unified model can be used to perform computational analysis not only for cancer but also for other complex diseases involving multiple organs beyond a single cell.
topic Biological knowledge
Essential biological information
Molecular specification
Biological context
Standard language
url http://link.springer.com/article/10.1186/s12859-019-2812-7
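
The abstract's claim that context annotations resolve contradictory associations during model integration can be illustrated with a minimal sketch. This record does not give the CODA-ML schema, so everything below is an assumption made for illustration: the `Association` fields, the `find_conflicts` and `to_xml` helpers, the XML element names, and the placeholder entities (`GeneX`, `FunctionY`, `cell type A`, `cell type B`) are all hypothetical and are not taken from the CODA-ML specification.

```python
from collections import defaultdict
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class Association:
    """A directed association; field names are hypothetical, not CODA-ML's."""
    source: str    # upstream entity (e.g. gene/protein, compound)
    relation: str  # direction of effect: "increase" or "decrease"
    target: str    # downstream entity, cellular function, or disease
    context: str   # biological context label, e.g. a cell type


def find_conflicts(associations, use_context):
    """Return the keys that carry more than one relation after merging.

    With use_context=False, associations are keyed by (source, target) only,
    so opposite effects observed in different contexts look contradictory.
    With use_context=True, the context is part of the key.
    """
    relations = defaultdict(set)
    for a in associations:
        if use_context:
            key = (a.source, a.target, a.context)
        else:
            key = (a.source, a.target)
        relations[key].add(a.relation)
    return sorted(k for k, seen in relations.items() if len(seen) > 1)


def to_xml(associations):
    """Serialize to an illustrative XML layout (NOT the CODA-ML schema)."""
    root = ET.Element("model")
    for a in associations:
        node = ET.SubElement(root, "association", relation=a.relation)
        ET.SubElement(node, "source").text = a.source
        ET.SubElement(node, "target").text = a.target
        ET.SubElement(node, "context").text = a.context
    return ET.tostring(root, encoding="unicode")


# Two placeholder models that disagree unless context is taken into account.
model_a = [Association("GeneX", "decrease", "FunctionY", "cell type A")]
model_b = [Association("GeneX", "increase", "FunctionY", "cell type B")]
merged = model_a + model_b

print(find_conflicts(merged, use_context=False))  # [('GeneX', 'FunctionY')]
print(find_conflicts(merged, use_context=True))   # []
print(to_xml(merged))
```

Keying each association by its context is one simple way to keep both statements in the merged model; the actual format may encode contexts and molecular specifications quite differently.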