A Novel Data Integration Framework Based on Unified Concept Model
Data is now being generated, collected, and analyzed at an unprecedented scale. Data integration is the problem of combining data from heterogeneous, autonomous data sources and providing users with a unified view of the integrated data. To design a data integration framework, we need to address...
Main Authors: | Bo Ma, Tonghai Jiang, Xi Zhou, Fan Zhao, Yating Yang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2017-01-01 |
Series: | IEEE Access |
Subjects: | Data integration; schema mapping; graph model; semantic similarity computation |
Online Access: | https://ieeexplore.ieee.org/document/7862127/ |
id |
doaj-58d0d28dd68049c899a7aa2de33f2ae8 |
---|---|
record_format |
Article |
spelling |
doaj-58d0d28dd68049c899a7aa2de33f2ae8; 2021-03-29T20:08:28Z; eng; IEEE; IEEE Access; 2169-3536; 2017-01-01; vol. 5, pp. 5713-5722; 10.1109/ACCESS.2017.2672822; 7862127; A Novel Data Integration Framework Based on Unified Concept Model; Bo Ma (https://orcid.org/0000-0003-3082-648X), Tonghai Jiang, Xi Zhou, Fan Zhao, Yating Yang (all: Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China); https://ieeexplore.ieee.org/document/7862127/; Data integration; schema mapping; graph model; semantic similarity computation |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Bo Ma, Tonghai Jiang, Xi Zhou, Fan Zhao, Yating Yang |
title |
A Novel Data Integration Framework Based on Unified Concept Model |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2017-01-01 |
description |
Data is now being generated, collected, and analyzed at an unprecedented scale. Data integration is the problem of combining data from heterogeneous, autonomous data sources and providing users with a unified view of the integrated data. Designing a data integration framework requires addressing challenges such as schema mapping, data cleaning, record linkage, and data fusion. This paper briefly introduces traditional data integration approaches and then proposes a novel graph-based data integration framework, built on a unified concept model (UCM), to address real-world refueling data integration problems. Within this framework, schema mapping is carried out and metadata from heterogeneous sources is integrated into a UCM. The UCM is easy to update and is important for effective schema mapping and data transformation. Following the structure of the UCM, data from different sources is automatically transformed into instance data and linked together using semantic similarity computation metrics; finally, the data is stored in a graph database. Experiments are carried out on heterogeneous data from refueling records, social networks of astroturfers, and vehicle trajectories. Experimental results and reference implementation demonstrations show good precision and recall for the proposed framework. |
topic |
Data integration; schema mapping; graph model; semantic similarity computation |
url |
https://ieeexplore.ieee.org/document/7862127/ |
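The pipeline summarized in the description (map source schemas onto a unified model, transform records into instances, link them by similarity, store them as a graph) can be illustrated with a minimal sketch. This is not the authors' implementation: the field names, the `SCHEMA_MAP` table, the sample plate numbers, and the threshold are all hypothetical, character-bigram Jaccard similarity stands in for the paper's semantic similarity metrics, and a plain in-memory adjacency map stands in for a graph database.

```python
from collections import defaultdict

# Hypothetical mapping from each source's field names to unified attributes.
SCHEMA_MAP = {
    "refueling": {"plate_no": "vehicle_id", "station": "location"},
    "trajectory": {"car_plate": "vehicle_id", "checkpoint": "location"},
}


def to_instance(source, record):
    """Rename a source record's mapped fields to the unified attribute names."""
    mapping = SCHEMA_MAP[source]
    return {mapping[k]: v for k, v in record.items() if k in mapping}


def bigram_jaccard(a, b):
    """Character-bigram Jaccard similarity (a stand-in, not a semantic metric)."""
    def grams(s):
        s = s.lower().replace(" ", "")
        return {s[i:i + 2] for i in range(len(s) - 1)}

    ga, gb = grams(a), grams(b)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)


def link(instances, key="vehicle_id", threshold=0.8):
    """Build an undirected adjacency map linking instances with similar keys."""
    graph = defaultdict(set)
    for i in range(len(instances)):
        for j in range(i + 1, len(instances)):
            if bigram_jaccard(instances[i][key], instances[j][key]) >= threshold:
                graph[i].add(j)
                graph[j].add(i)
    return graph


# Illustrative records only; none of these values come from the paper.
a = to_instance("refueling", {"plate_no": "XA 12345", "station": "S1", "liters": 40})
b = to_instance("trajectory", {"car_plate": "xa12345", "checkpoint": "C7"})
c = to_instance("trajectory", {"car_plate": "XB99999", "checkpoint": "C2"})
graph = link([a, b, c])
```

Here the first two records are linked because their plate strings match after normalization, while the third stays unlinked. The pairwise comparison is quadratic in the number of instances, so a real system would likely need blocking or indexing before similarity computation.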