An approach for semantic integration of heterogeneous data sources
Integrating data from multiple heterogeneous data sources entails dealing with data distributed among heterogeneous information sources, which can be structured, semi-structured or unstructured, and providing the user with a unified view of these data. Thus, in general, gathering information is chal...
Main Authors: | Giuseppe Fusco, Lerina Aversano |
---|---|
Format: | Article |
Language: | English |
Published: | PeerJ Inc. 2020-03-01 |
Series: | PeerJ Computer Science |
Subjects: | Data integration, Heterogeneous data sources, Semantic integration, Ontologies |
Online Access: | https://peerj.com/articles/cs-254.pdf |
id | doaj-1acdb41b7b9540dd957b01a986ffd220 |
---|---|
record_format | Article |
spelling | doaj-1acdb41b7b9540dd957b01a986ffd220; 2020-11-25T00:18:42Z; eng; PeerJ Inc.; PeerJ Computer Science; 2376-5992; 2020-03-01; 6; e254; 10.7717/peerj-cs.254; An approach for semantic integration of heterogeneous data sources; Giuseppe Fusco (0); Lerina Aversano (1); Department of Engineering, University of Sannio, Benevento, BN, Italia; Department of Engineering, University of Sannio, Benevento, BN, Italia; Integrating data from multiple heterogeneous data sources entails dealing with data distributed among heterogeneous information sources, which can be structured, semi-structured or unstructured, and providing the user with a unified view of these data. Thus, in general, gathering information is challenging, and one of the main reasons is that data sources are designed to support specific applications. Very often their structure is unknown to the large part of users. Moreover, the stored data is often redundant, mixed with information only needed to support enterprise processes, and incomplete with respect to the business domain. Collecting, integrating, reconciling and efficiently extracting information from heterogeneous and autonomous data sources is regarded as a major challenge. In this paper, we present an approach for the semantic integration of heterogeneous data sources, DIF (Data Integration Framework), and a software prototype to support all aspects of a complex data integration process. The proposed approach is an ontology-based generalization of both Global-as-View and Local-as-View approaches. In particular, to overcome problems due to semantic heterogeneity and to support interoperability with external systems, ontologies are used as a conceptual schema to represent both data sources to be integrated and the global view.; https://peerj.com/articles/cs-254.pdf; Data integration; Heterogeneous data sources; Semantic integration; Ontologies |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | Giuseppe Fusco; Lerina Aversano |
spellingShingle | Giuseppe Fusco; Lerina Aversano; An approach for semantic integration of heterogeneous data sources; PeerJ Computer Science; Data integration; Heterogeneous data sources; Semantic integration; Ontologies |
author_facet | Giuseppe Fusco; Lerina Aversano |
author_sort | Giuseppe Fusco |
title | An approach for semantic integration of heterogeneous data sources |
title_sort | approach for semantic integration of heterogeneous data sources |
publisher | PeerJ Inc. |
series | PeerJ Computer Science |
issn | 2376-5992 |
publishDate | 2020-03-01 |
description | Integrating data from multiple heterogeneous data sources entails dealing with data distributed among heterogeneous information sources, which can be structured, semi-structured or unstructured, and providing the user with a unified view of these data. Thus, in general, gathering information is challenging, and one of the main reasons is that data sources are designed to support specific applications. Very often their structure is unknown to the large part of users. Moreover, the stored data is often redundant, mixed with information only needed to support enterprise processes, and incomplete with respect to the business domain. Collecting, integrating, reconciling and efficiently extracting information from heterogeneous and autonomous data sources is regarded as a major challenge. In this paper, we present an approach for the semantic integration of heterogeneous data sources, DIF (Data Integration Framework), and a software prototype to support all aspects of a complex data integration process. The proposed approach is an ontology-based generalization of both Global-as-View and Local-as-View approaches. In particular, to overcome problems due to semantic heterogeneity and to support interoperability with external systems, ontologies are used as a conceptual schema to represent both data sources to be integrated and the global view. |
topic | Data integration; Heterogeneous data sources; Semantic integration; Ontologies |
url | https://peerj.com/articles/cs-254.pdf |
work_keys_str_mv | AT giuseppefusco anapproachforsemanticintegrationofheterogeneousdatasources; AT lerinaaversano anapproachforsemanticintegrationofheterogeneousdatasources; AT giuseppefusco approachforsemanticintegrationofheterogeneousdatasources; AT lerinaaversano approachforsemanticintegrationofheterogeneousdatasources |
_version_ | 1725375109148966912 |