Data Mining Methods for Omics and Knowledge of Crude Medicinal Plants toward Big Data Biology
Molecular biological data have increased rapidly with recent progress in the Omics fields, e.g., genomics, transcriptomics, proteomics and metabolomics, which necessitates the development of databases and methods for efficient storage, retrieval, integration and analysis of massive data. The presen...
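Main Authors: | Farit M. Afendi, Naoaki Ono, Latifah K. Darusman, Kensuke Nakamura, Yukiko Nakamura, Nelson Kibinge, Aki Hirai Morita, Hisayuki Horai, Md. Altaf-Ul-Amin, Shigehiko Kanaya, Ken Tanaka |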
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2013-01-01 |
Series: | Computational and Structural Biotechnology Journal |
Online Access: | http://journals.sfu.ca/rncsb/index.php/csbj/article/view/csbj.201301010 |
id |
doaj-8d329981f26a46abb8454ec1288c1c4b |
---|---|
record_format |
Article |
spelling |
doaj-8d329981f26a46abb8454ec1288c1c4b; 2020-11-24T20:54:25Z; eng; Elsevier; Computational and Structural Biotechnology Journal; 2001-0370; 2013-01-01; 45e201301010; Data Mining Methods for Omics and Knowledge of Crude Medicinal Plants toward Big Data Biology; Farit M. Afendi, Naoaki Ono, Latifah K. Darusman, Kensuke Nakamura, Yukiko Nakamura, Nelson Kibinge, Aki Hirai Morita, Hisayuki Horai, Md. Altaf-Ul-Amin, Shigehiko Kanaya, Ken Tanaka; Molecular biological data have increased rapidly with recent progress in the Omics fields, e.g., genomics, transcriptomics, proteomics and metabolomics, which necessitates the development of databases and methods for efficient storage, retrieval, integration and analysis of massive data. The present study reviews the usage of the KNApSAcK Family DB in metabolomics and related areas, discusses several statistical methods for handling multivariate data, and shows their application to Indonesian blended herbal medicines (Jamu) as a case study. Exploration using a biplot reveals that many plants are rarely utilized, while some plants are highly utilized toward a specific efficacy. Furthermore, the ingredients of Jamu formulas are modeled using Partial Least Squares Discriminant Analysis (PLS-DA) in order to predict their efficacy. The plants used in each Jamu medicine serve as the predictors, whereas the efficacy of each Jamu provides the responses. This model achieves 71.6% correct classification in predicting efficacy. A permutation test is then used to determine which plants serve as main ingredients in Jamu formulas by evaluating the significance of the PLS-DA coefficients. Next, in order to explain the role of the plants that serve as main ingredients in Jamu medicines, information on the pharmacological activity of the plants is added to the predictor block. An N-PLS-DA model, the multiway version of PLS-DA, is then used to handle the three-dimensional array of the predictor block. The resulting N-PLS-DA model reveals that the effects of some pharmacological activities are specific to a certain efficacy, while other activities act across many efficacies. The mathematical modeling introduced in the present study can be utilized in global analysis of big data aimed at revealing the underlying biology.; http://journals.sfu.ca/rncsb/index.php/csbj/article/view/csbj.201301010 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Farit M. Afendi Naoaki Ono Latifah K. Darusman Kensuke Nakamura Yukiko Nakamura Nelson Kibinge Aki Hirai Morita Hisayuki Horai Md. Altaf-Ul-Amin Shigehiko Kanaya Ken Tanaka |
author_sort |
Farit M. Afendi |
title |
Data Mining Methods for Omics and Knowledge of Crude Medicinal Plants toward Big Data Biology |
publisher |
Elsevier |
series |
Computational and Structural Biotechnology Journal |
issn |
2001-0370 |
publishDate |
2013-01-01 |
description |
Molecular biological data have increased rapidly with recent progress in the Omics fields, e.g., genomics, transcriptomics, proteomics and metabolomics, which necessitates the development of databases and methods for efficient storage, retrieval, integration and analysis of massive data. The present study reviews the usage of the KNApSAcK Family DB in metabolomics and related areas, discusses several statistical methods for handling multivariate data, and shows their application to Indonesian blended herbal medicines (Jamu) as a case study. Exploration using a biplot reveals that many plants are rarely utilized, while some plants are highly utilized toward a specific efficacy. Furthermore, the ingredients of Jamu formulas are modeled using Partial Least Squares Discriminant Analysis (PLS-DA) in order to predict their efficacy. The plants used in each Jamu medicine serve as the predictors, whereas the efficacy of each Jamu provides the responses. This model achieves 71.6% correct classification in predicting efficacy. A permutation test is then used to determine which plants serve as main ingredients in Jamu formulas by evaluating the significance of the PLS-DA coefficients. Next, in order to explain the role of the plants that serve as main ingredients in Jamu medicines, information on the pharmacological activity of the plants is added to the predictor block. An N-PLS-DA model, the multiway version of PLS-DA, is then used to handle the three-dimensional array of the predictor block. The resulting N-PLS-DA model reveals that the effects of some pharmacological activities are specific to a certain efficacy, while other activities act across many efficacies. The mathematical modeling introduced in the present study can be utilized in global analysis of big data aimed at revealing the underlying biology. |
url |
http://journals.sfu.ca/rncsb/index.php/csbj/article/view/csbj.201301010 |
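The abstract describes predicting Jamu efficacy from plant ingredients with PLS-DA and then screening plants with a permutation test on the coefficients. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a single latent component and a single binary response (one efficacy versus the rest), whereas the article models several efficacies. The data are synthetic, and all names (`fit_pls1`, `predict_class`, `permutation_pvalues`) are hypothetical.

```python
import random


def fit_pls1(X, y):
    """Fit a one-component PLS1 model; returns (coefficients, x_means, y_mean)."""
    n, p = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximal covariance between X and y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    if norm == 0.0:
        return [0.0] * p, x_means, y_mean
    w = [v / norm for v in w]
    # Latent scores t = Xc w, then regress the centered response on t.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    # With a single component the regression coefficients reduce to q * w.
    return [q * v for v in w], x_means, y_mean


def predict_class(model, row):
    """Discriminant step: threshold the predicted response at 0.5."""
    coefs, x_means, y_mean = model
    score = y_mean + sum(b * (x - m) for b, x, m in zip(coefs, row, x_means))
    return 1 if score >= 0.5 else 0


def permutation_pvalues(X, y, n_perm=200, seed=0):
    """Per-plant p-values: how often a shuffled response gives |coef| >= observed."""
    rng = random.Random(seed)
    observed = fit_pls1(X, y)[0]
    exceed = [0] * len(observed)
    y_perm = list(y)
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        permuted = fit_pls1(X, y_perm)[0]
        for j, (b_obs, b_perm) in enumerate(zip(observed, permuted)):
            if abs(b_perm) >= abs(b_obs):
                exceed[j] += 1
    return [(c + 1) / (n_perm + 1) for c in exceed]


# Synthetic data (an assumption): 40 formulas, 5 plants, 1 = plant is used.
# Plant 0 is used mostly in formulas with the target efficacy (y = 1);
# plants 1-4 follow patterns that are balanced across the two classes.
n_formulas, n_plants = 40, 5
y = [1 if i < 20 else 0 for i in range(n_formulas)]
X = []
for i in range(n_formulas):
    row = [1 if (i < 18 or i in (20, 21)) else 0]
    row += [1 if i % (j + 1) == 0 else 0 for j in range(1, n_plants)]
    X.append(row)

model = fit_pls1(X, y)
coefs = model[0]
accuracy = sum(predict_class(model, r) == t for r, t in zip(X, y)) / n_formulas
pvals = permutation_pvalues(X, y)
```

In this toy setting only plant 0 receives a coefficient far from zero, so it is the one the permutation test would flag as a main ingredient. A multi-efficacy analysis like the one summarized above would need a multi-response PLS with more components and validation on held-out formulas.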