IMADS 2.0: an extended iMADS for MADS-box gene classification in orchid by machine learning approach

Master's thesis === National Chung Hsing University === Institute of Genomics and Bioinformatics === 103 === Plant MIKC-type MADS-box transcription factors play an important role in controlling floral organ development. The ABCDE model is an essential model for describing how angiosperm MADS-box genes regulate floral organ identity. Phylogenetic tree is the most common m...

Bibliographic Details
Main Authors: Kuan-Chun Chen, 陳冠群
Other Authors: Yen-Wei Chu
Format: Others
Language: zh-TW
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/91440902228590407851
id ndltd-TW-103NCHU5105039
record_format oai_dc
spelling ndltd-TW-103NCHU5105039 2016-08-15T04:17:49Z http://ndltd.ncl.edu.tw/handle/91440902228590407851 IMADS 2.0: an extended iMADS for MADS-box gene classification in orchid by machine learning approach IMADS 2.0:建構延伸型機器學習方法改進iMADS在蘭科之MADS-box基因預測 (IMADS 2.0: building an extended machine learning method to improve iMADS MADS-box gene prediction in Orchidaceae) Kuan-Chun Chen 陳冠群 Master's thesis National Chung Hsing University Institute of Genomics and Bioinformatics 103 Plant MIKC-type MADS-box transcription factors play an important role in controlling floral organ development. The ABCDE model is an essential model for describing how angiosperm MADS-box genes regulate floral organ identity. Phylogenetic trees are the most common method for gene classification. However, our previous study found that when phylogenetic trees are applied to massive, multi-species, or incomplete sequences, they can be time-consuming and produce misclassifications. The NCB lab developed iMADS, a web-based tool that classifies angiosperm MADS-box genes with a machine learning method. However, the system's training dataset was outdated and had not been appropriately filtered. In addition, the five-class ABCDE model cannot adequately describe species that have unique floral organs. In this study, we use phylogenetic analysis to group the data by unsupervised clustering. Misclassified data are corrected according to the literature. All genes that are specifically expressed in floral organs are selected as the training dataset. The training model is constructed in two stages. In addition to trying various features, we extended the five-class ABCDE model to eight classes and used support vector machines to construct multiple prediction models, iMADS2.0, according to the domain characteristics of MADS-box genes. The results show that BLAST features achieve higher accuracy than the BindN and COILS features. An independent dataset and the sequences misclassified by the phylogenetic tree were submitted to the prediction model for performance evaluation. The results showed that the model not only improves prediction accuracy but also assigns every sequence to its proper class.
Finally, we used bioinformatics tools to discuss the relationship between the physicochemical properties of the C-terminal domain and the regulatory mechanism of the transcription activation region. iMADS2.0 provides the predicted MADS-box gene classification, the most similar predicted sequences, and visualized expression patterns for the region supplied by the user. The web-based tool is freely available at http://predictor.nchu.edu.tw/iMADS2. Yen-Wei Chu 朱彥煒 2015 thesis 47 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's thesis === National Chung Hsing University === Institute of Genomics and Bioinformatics === 103 === Plant MIKC-type MADS-box transcription factors play an important role in controlling floral organ development. The ABCDE model is an essential model for describing how angiosperm MADS-box genes regulate floral organ identity. Phylogenetic trees are the most common method for gene classification. However, our previous study found that when phylogenetic trees are applied to massive, multi-species, or incomplete sequences, they can be time-consuming and produce misclassifications. The NCB lab developed iMADS, a web-based tool that classifies angiosperm MADS-box genes with a machine learning method. However, the system's training dataset was outdated and had not been appropriately filtered. In addition, the five-class ABCDE model cannot adequately describe species that have unique floral organs. In this study, we use phylogenetic analysis to group the data by unsupervised clustering. Misclassified data are corrected according to the literature. All genes that are specifically expressed in floral organs are selected as the training dataset. The training model is constructed in two stages. In addition to trying various features, we extended the five-class ABCDE model to eight classes and used support vector machines to construct multiple prediction models, iMADS2.0, according to the domain characteristics of MADS-box genes. The results show that BLAST features achieve higher accuracy than the BindN and COILS features. An independent dataset and the sequences misclassified by the phylogenetic tree were submitted to the prediction model for performance evaluation. The results showed that the model not only improves prediction accuracy but also assigns every sequence to its proper class. Finally, we used bioinformatics tools to discuss the relationship between the physicochemical properties of the C-terminal domain and the regulatory mechanism of the transcription activation region.
iMADS2.0 provides the predicted MADS-box gene classification, the most similar predicted sequences, and visualized expression patterns for the region supplied by the user. The web-based tool is freely available at http://predictor.nchu.edu.tw/iMADS2.
author2 Yen-Wei Chu
author_facet Yen-Wei Chu
Kuan-Chun Chen
陳冠群
author Kuan-Chun Chen
陳冠群
spellingShingle Kuan-Chun Chen
陳冠群
IMADS 2.0: an extended iMADS for MADS-box gene classification in orchid by machine learning approach
author_sort Kuan-Chun Chen
title IMADS 2.0: an extended iMADS for MADS-box gene classification in orchid by machine learning approach
title_short IMADS 2.0: an extended iMADS for MADS-box gene classification in orchid by machine learning approach
title_full IMADS 2.0: an extended iMADS for MADS-box gene classification in orchid by machine learning approach
title_fullStr IMADS 2.0: an extended iMADS for MADS-box gene classification in orchid by machine learning approach
title_full_unstemmed IMADS 2.0: an extended iMADS for MADS-box gene classification in orchid by machine learning approach
title_sort imads 2.0: an extended imads for mads-box gene classification in orchid by machine learning approach
publishDate 2015
url http://ndltd.ncl.edu.tw/handle/91440902228590407851