Data Fusion in Big Data Analytics: a Knowledge Graph Approach

PhD === Yuan Ze University === Department of Computer Science and Engineering === 104 === Traditional data fusion methods are inadequate for big data analytics. A knowledge graph describes entities, concepts, and the links among them. It can capture and present the semantic relations between concepts in a domain and overcome the semantic he...


Bibliographic Details
Main Author: Kai-Biao Lin (林開標)
Other Authors: K. Robert Lai
Format: Others
Language: en_US
Published: 2016
Online Access: http://ndltd.ncl.edu.tw/handle/mwzrfq
id ndltd-TW-104YZU05392022
record_format oai_dc
spelling ndltd-TW-104YZU05392022 2019-05-15T22:53:47Z http://ndltd.ncl.edu.tw/handle/mwzrfq Data Fusion in Big Data Analytics: a Knowledge Graph Approach 數據融合:以知識圖譜為基 Kai-Biao Lin 林開標 PhD Yuan Ze University Department of Computer Science and Engineering 104 K. Robert Lai Chien-Lung Chan 賴國華 詹前隆 2016 dissertation ; thesis 154 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description PhD === Yuan Ze University === Department of Computer Science and Engineering === 104 === Traditional data fusion methods are inadequate for big data analytics. A knowledge graph describes entities, concepts, and the links among them. It can capture and present the semantic relations between concepts in a domain and so overcome the semantic heterogeneity of big data. Numerous large-scale general-purpose knowledge graphs have already been applied in many big data scenarios. This dissertation therefore proposes a knowledge graph-based approach to data fusion in big data analytics. First, a domain ontology is constructed for each specific domain, and a global ontology is then built on top of the domain ontologies to provide consistency checking and redundancy removal for knowledge graph fusion. Next, a domain knowledge graph is constructed for each domain, with a focus on ontology-based entity and relation extraction. Finally, the domain knowledge graphs are fused into a general knowledge graph through similarity detection, entity alignment, conflict resolution, relation redirection, and data migration. To complete the knowledge graph, the dissertation proposes TransC, a constraint-based embedding model. While constructing corrupted triplets, TransC applies the semantic-type constraints of each relation to exclude triplets that do not conform to them. The experimental results show that TransC not only retains the simple parameterization and fast training speed of TransE but also improves prediction accuracy. To verify the practicality and effectiveness of the proposed framework, the dissertation applies it to construct a general knowledge graph from medical, environmental, and meteorological data and effectively establishes the interrelations within the graph.
In addition, this research developed a knowledge graph management platform that provides a consistent access interface and a unified view of the heterogeneous data sources, so that users can perform advanced search, statistics, analysis, and visualization.
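The abstract describes TransC only at a high level, so the following is a minimal sketch of type-constrained negative sampling on top of a TransE-style score, not the dissertation's implementation. The entity names, the `aggravates` relation, the type table, and the function names `corrupt`, `transe_distance`, and `margin_loss` are all hypothetical and chosen for illustration; the L1 distance and margin ranking loss are the standard TransE formulation and are assumed here to carry over.

```python
import random

# Hypothetical toy data (not from the dissertation): entity -> semantic type.
ENTITY_TYPE = {
    "asthma": "Disease",
    "influenza": "Disease",
    "PM2.5": "Pollutant",
    "ozone": "Pollutant",
    "Taipei": "Location",
}
# Hypothetical constraint table: relation -> (allowed head type, allowed tail type).
RELATION_TYPES = {"aggravates": ("Pollutant", "Disease")}


def corrupt(triplet, rng):
    """Return a corrupted triplet that still satisfies the relation's type constraint."""
    head, relation, tail = triplet
    head_type, tail_type = RELATION_TYPES[relation]
    replace_head = rng.random() < 0.5
    wanted_type = head_type if replace_head else tail_type
    original = head if replace_head else tail
    # Only entities of the required type are eligible substitutes, so
    # type-violating triplets such as (Taipei, aggravates, asthma) never appear.
    candidates = sorted(
        e for e, t in ENTITY_TYPE.items() if t == wanted_type and e != original
    )
    if not candidates:
        raise ValueError("no type-compatible entity to substitute")
    substitute = rng.choice(candidates)
    if replace_head:
        return (substitute, relation, tail)
    return (head, relation, substitute)


def transe_distance(h, r, t):
    """TransE dissimilarity: L1 norm of h + r - t (lower means more plausible)."""
    return sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))


def margin_loss(pos_dist, neg_dist, margin=1.0):
    """Margin ranking loss for one positive/negative pair of distances."""
    return max(0.0, margin + pos_dist - neg_dist)
```

In a training loop, each observed triplet would be paired with one output of `corrupt`, both would be scored with `transe_distance` on their embedding vectors, and `margin_loss` would be minimized. Restricting the candidates by type is the only change relative to plain TransE sampling in this sketch, which is consistent with the abstract's claim that the parameter count stays the same.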
author2 K. Robert Lai
author_facet K. Robert Lai
Kai-Biao Lin
林開標
author Kai-Biao Lin
林開標
title Data Fusion in Big Data Analytics: a Knowledge Graph Approach
publishDate 2016
url http://ndltd.ncl.edu.tw/handle/mwzrfq