A Novel Data Integration Framework Based on Unified Concept Model

Nowadays, data is being generated, collected, and analyzed at an unprecedented scale. Data integration is the problem of combining data from heterogeneous, autonomous data sources and providing users with a unified view of the integrated data. To design a data integration framework, we need to address...


Bibliographic Details
Main Authors: Bo Ma, Tonghai Jiang, Xi Zhou, Fan Zhao, Yating Yang
Format: Article
Language: English
Published: IEEE 2017-01-01
Series: IEEE Access
Subjects: Data integration; schema mapping; graph model; semantic similarity computation
Online Access: https://ieeexplore.ieee.org/document/7862127/
id doaj-58d0d28dd68049c899a7aa2de33f2ae8
record_format Article
spelling doaj-58d0d28dd68049c899a7aa2de33f2ae8
2021-03-29T20:08:28Z
eng
IEEE
IEEE Access
2169-3536
2017-01-01
5
5713
5722
10.1109/ACCESS.2017.2672822
7862127
A Novel Data Integration Framework Based on Unified Concept Model
Bo Ma 0 https://orcid.org/0000-0003-3082-648X
Tonghai Jiang 1
Xi Zhou 2
Fan Zhao 3
Yating Yang 4
Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
Nowadays, data is being generated, collected, and analyzed at an unprecedented scale. Data integration is the problem of combining data from heterogeneous, autonomous data sources and providing users with a unified view of the integrated data. To design a data integration framework, we need to address challenges such as schema mapping, data cleaning, record linkage, and data fusion. In this paper, we briefly introduce the traditional data integration approaches and then propose a novel graph-based data integration framework, based on a unified concept model (UCM), to address real-world refueling data integration problems. Within this framework, schema mapping is carried out and metadata from heterogeneous sources is integrated in a UCM. The UCM is easy to update, and it is also important for effective schema mapping and data transformation. Following the structure of the UCM, data from different sources is automatically transformed into instance data and linked together using semantic similarity computation metrics. Finally, the data is stored in a graph database. Experiments are carried out on heterogeneous data from refueling records, social networks of astroturfers, and vehicle trajectories. Experimental results and reference implementation demonstrations show good precision and recall for the proposed framework.
https://ieeexplore.ieee.org/document/7862127/
Data integration
schema mapping
graph model
semantic similarity computation
collection DOAJ
language English
format Article
sources DOAJ
author Bo Ma
Tonghai Jiang
Xi Zhou
Fan Zhao
Yating Yang
title A Novel Data Integration Framework Based on Unified Concept Model
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2017-01-01
description Nowadays, data is being generated, collected, and analyzed at an unprecedented scale. Data integration is the problem of combining data from heterogeneous, autonomous data sources and providing users with a unified view of the integrated data. To design a data integration framework, we need to address challenges such as schema mapping, data cleaning, record linkage, and data fusion. In this paper, we briefly introduce the traditional data integration approaches and then propose a novel graph-based data integration framework, based on a unified concept model (UCM), to address real-world refueling data integration problems. Within this framework, schema mapping is carried out and metadata from heterogeneous sources is integrated in a UCM. The UCM is easy to update, and it is also important for effective schema mapping and data transformation. Following the structure of the UCM, data from different sources is automatically transformed into instance data and linked together using semantic similarity computation metrics. Finally, the data is stored in a graph database. Experiments are carried out on heterogeneous data from refueling records, social networks of astroturfers, and vehicle trajectories. Experimental results and reference implementation demonstrations show good precision and recall for the proposed framework.
topic Data integration
schema mapping
graph model
semantic similarity computation
url https://ieeexplore.ieee.org/document/7862127/