A Study of an Effect Soft Clustering Approach to Mining Gene Expressions from Multi-Source Databases

Master's thesis === National University of Tainan (國立臺南大學) === Graduate Institute of Information Education, Master's Program (資訊教育研究所碩士班) === 94 === With advances in biotechnology, large biological databases have become useful data warehouses, covering microarray data, biomedical literature, sequence data, genome structure data, and so on. In recent years, a major topic in bioinformatics has been mining hid...


Bibliographic Details
Main Authors: Hsiu-min Chuang, 莊秀敏
Other Authors: Chien-I Lee
Format: Others
Language: zh-TW
Published: 2006
Online Access: http://ndltd.ncl.edu.tw/handle/20295146071817100481
id ndltd-TW-094NTNT5395022
record_format oai_dc
spelling ndltd-TW-094NTNT5395022 2015-10-13T14:49:03Z http://ndltd.ncl.edu.tw/handle/20295146071817100481 A Study of an Effect Soft Clustering Approach to Mining Gene Expressions from Multi-Source Databases 利用軟分群方法從多源資料庫中有效探勘基因表現之研究 Hsiu-min Chuang 莊秀敏 Master's thesis National University of Tainan (國立臺南大學) Graduate Institute of Information Education, Master's Program (資訊教育研究所碩士班) 94 With advances in biotechnology, large biological databases have become useful data warehouses, covering microarray data, biomedical literature, sequence data, genome structure data, and so on. In recent years, a major topic in bioinformatics has been mining hidden and meaningful information from heterogeneous data. The goal of such mining is to achieve higher accuracy than is possible with a single dataset and to predict gene-gene relations and genetic networks in advance. Multi-Sources Clustering (MSC) is an important and representative approach to mining multiple sources; however, MSC does not account for the fact that a gene may have multiple functions and participate in several biological pathways. MSC also ignores the possibility that heterogeneous data sources differ in their properties and accuracy. In this study, we propose Multi-Source Soft Clustering (MSSC), which uses fuzzy c-means and soft CAST to address these problems. MSSC clusters before integrating in order to improve overall accuracy, and uses the correlation coefficient to calculate the distance between different soft clusterings. Finally, the experiments show that MSSC is more accurate than MSC in both general and specific cases. Chien-I Lee 李建億 2006 degree thesis (學位論文) ; thesis 67 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's thesis === National University of Tainan (國立臺南大學) === Graduate Institute of Information Education, Master's Program (資訊教育研究所碩士班) === 94 === With advances in biotechnology, large biological databases have become useful data warehouses, covering microarray data, biomedical literature, sequence data, genome structure data, and so on. In recent years, a major topic in bioinformatics has been mining hidden and meaningful information from heterogeneous data. The goal of such mining is to achieve higher accuracy than is possible with a single dataset and to predict gene-gene relations and genetic networks in advance. Multi-Sources Clustering (MSC) is an important and representative approach to mining multiple sources; however, MSC does not account for the fact that a gene may have multiple functions and participate in several biological pathways. MSC also ignores the possibility that heterogeneous data sources differ in their properties and accuracy. In this study, we propose Multi-Source Soft Clustering (MSSC), which uses fuzzy c-means and soft CAST to address these problems. MSSC clusters before integrating in order to improve overall accuracy, and uses the correlation coefficient to calculate the distance between different soft clusterings. Finally, the experiments show that MSSC is more accurate than MSC in both general and specific cases.
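The abstract names fuzzy c-means and a correlation-based distance between soft clusterings but gives no formulas. The block below is only an illustrative sketch of that general idea, not the thesis's MSSC procedure: it runs a textbook fuzzy c-means on two toy "sources" describing the same six genes, then compares the two soft clusterings by the Pearson correlation of their gene-pair co-membership scores. The toy data, the co-membership comparison, and all function and variable names are assumptions made for illustration; soft CAST is not shown.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, seed=0):
    """Textbook fuzzy c-means; returns an n x c membership matrix (rows sum to 1)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, normalised per gene.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    for _ in range(iters):
        # Centroids are membership-weighted means (weights raised to the fuzzifier m).
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / total
                            for d in range(dim)])
        # Memberships are proportional to distance^(-2 / (m - 1)).
        for i in range(n):
            dist = [max(math.dist(points[i], centers[k]), 1e-12) for k in range(c)]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
            total = sum(inv)
            u[i] = [v / total for v in inv]
    return u


def co_membership(u):
    """Score each gene pair by the dot product of the two genes' membership rows."""
    n = len(u)
    return [sum(a * b for a, b in zip(u[i], u[j]))
            for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


# Hypothetical toy data: the same six genes described by two different sources.
source_a = [[0.1, 0.2], [0.0, 0.3], [0.2, 0.1],
            [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
source_b = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1], [1.1, 0.2, 0.1],
            [0.0, 1.0, 0.9], [0.1, 0.9, 1.0], [0.0, 1.1, 1.0]]

# Cluster each source separately, then compare the resulting soft clusterings.
u_a = fuzzy_c_means(source_a, c=2, seed=1)
u_b = fuzzy_c_means(source_b, c=2, seed=2)
agreement = pearson(co_membership(u_a), co_membership(u_b))
distance = 1.0 - agreement  # assumed conversion from correlation to a distance
print(f"agreement={agreement:.3f} distance={distance:.3f}")
```

Comparing co-membership scores rather than raw membership columns sidesteps the fact that cluster labels from two independent runs need not line up; whether the thesis handles label alignment this way is not stated in the abstract.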
author2 Chien-I Lee
author_facet Chien-I Lee
Hsiu-min Chuang
莊秀敏
author Hsiu-min Chuang
莊秀敏
title A Study of an Effect Soft Clustering Approach to Mining Gene Expressions from Multi-Source Databases
publishDate 2006
url http://ndltd.ncl.edu.tw/handle/20295146071817100481