Mining Diversified Decision Trees Across Multiple Datasets to Capture Similarities and Alignable Differences

Bibliographic Details
Main Author: Han, Qian
Language: English
Published: Wright State University / OhioLINK 2013
Subjects: Computer Science
Online Access: http://rave.ohiolink.edu/etdc/view?acc_num=wright1376511749
id ndltd-OhioLink-oai-etd.ohiolink.edu-wright1376511749
record_format oai_dc
spelling ndltd-OhioLink-oai-etd.ohiolink.edu-wright1376511749 2021-08-03T06:19:10Z Mining Diversified Decision Trees Across Multiple Datasets to Capture Similarities and Alignable Differences Han, Qian Computer Science
This dissertation studies the problem of mining shared and alignable difference knowledge structures across multiple datasets/applications. Shared and alignable difference knowledge structures are important for identifying analogies between application domains, for forming new hypotheses in challenging research applications, and for assessing the degree and types of knowledge-level similarities and differences between application domains for use in learning transfer.
Generally speaking, shared knowledge structures characterize underlying datasets and highlight conceptual-level structural similarities among the datasets. This dissertation studies the mining of shared decision trees, which are a special type of shared knowledge structure. We first consider building one shared decision tree with high classification accuracy and high data distribution similarity for two given datasets. We then observe that a single shared decision tree may present only a limited view of the behaviors shared between two given datasets. To help users select from multiple diversified perspectives on shared knowledge structures, we propose the diversified decision tree set mining problem, whose goal is to mine a small set of k diversified, high-quality shared decision trees. Besides requiring each tree in the set to have high classification accuracy and highly similar data distributions in the given datasets, different trees in the set are also required to be highly different from each other. Algorithms are developed to solve both problems. Experimental results on microarray datasets for medicine are reported to evaluate the algorithms, together with the mined shared decision trees.
This dissertation also introduces and studies the mining of alignable differences. Roughly speaking, alignable difference knowledge structures indicate significant differences in the context of a large number of similarities between two given datasets. This dissertation considers alignable differences in the form of cross-domain decision trees. An algorithm to solve this problem is presented. Experimental results on microarray datasets for medicine are reported to evaluate the algorithm.
2013-08-16 English text Wright State University / OhioLINK http://rave.ohiolink.edu/etdc/view?acc_num=wright1376511749 unrestricted This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws.
collection NDLTD
language English
sources NDLTD
topic Computer Science
author Han, Qian
title Mining Diversified Decision Trees Across Multiple Datasets to Capture Similarities and Alignable Differences
publisher Wright State University / OhioLINK
publishDate 2013
url http://rave.ohiolink.edu/etdc/view?acc_num=wright1376511749