Ontology-based Feature Construction on Non-structured Data

Bibliographic Details
Main Author: Ni, Weizeng
Language: English
Published: University of Cincinnati / OhioLINK 2015
Subjects: Engineering; Feature construction; ontology; domain knowledge; knowledge discovery
Online Access: http://rave.ohiolink.edu/etdc/view?acc_num=ucin1439309340
id ndltd-OhioLink-oai-etd.ohiolink.edu-ucin1439309340
record_format oai_dc
spelling ndltd-OhioLink-oai-etd.ohiolink.edu-ucin1439309340 2021-08-03T06:32:47Z Ontology-based Feature Construction on Non-structured Data Ni, Weizeng Engineering Feature construction ontology domain knowledge knowledge discovery Data mining on non-structured data is a relatively under-researched area because most efforts in the KDD community over the last decades have been devoted to mining relational, structured data. With the information explosion of the big-data era, the majority of knowledge is emerging in various forms of non-structured data. This calls for new methodologies for constructing meaningful features from non-structured data to facilitate knowledge learning. Most existing data-driven methods serve only the objective of improving the discriminative power of features, while severely underestimating the importance of interpretability. In many domains, the prime aim is to discover and learn new hypotheses and knowledge from non-structured data in a meaningful and understandable form. In this study, an ontology-based feature construction framework is proposed. The framework represents the structural relations embedded in domain knowledge in the form of an ontology. Features at three levels are defined according to the granularity of the ontology. A feature, representing a domain hypothesis, can be readily constructed by evolving the ontology. Two criteria, support and confidence, are proposed to evaluate the usefulness of features and to guide the search for optimal ones. Furthermore, domain experts are involved interactively to explore new hypotheses with the aid of data-driven heuristic algorithms. The ontology is also flexible enough to be reconstructed to accommodate different hypotheses. A comprehensive case study is conducted in which the proposed methodology is applied to miscellaneous medical claim data to build features that are both interpretable and highly predictive for hospitalization forecasting. A medical professor is consulted throughout to bring in domain insights that aid ontology evolution and to assess the meaningfulness of the constructed features and predictions. The constructed features outperform those based on the initial hypotheses in terms of prediction accuracy. Moreover, the ability to discover new and useful knowledge is demonstrated by the meaningfulness of the new features and the evolved ontology. 2015-09-10 English text University of Cincinnati / OhioLINK http://rave.ohiolink.edu/etdc/view?acc_num=ucin1439309340 unrestricted This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws.
collection NDLTD
language English
sources NDLTD
topic Engineering
Feature construction
ontology
domain knowledge
knowledge discovery
author Ni, Weizeng
title Ontology-based Feature Construction on Non-structured Data
publisher University of Cincinnati / OhioLINK
publishDate 2015
url http://rave.ohiolink.edu/etdc/view?acc_num=ucin1439309340
work_keys_str_mv AT niweizeng ontologybasedfeatureconstructiononnonstructureddata
_version_ 1719438900423819264