An approach for semantic integration of heterogeneous data sources

Integrating data from multiple heterogeneous data sources entails dealing with data distributed among heterogeneous information sources, which can be structured, semi-structured or unstructured, and providing the user with a unified view of these data. Thus, in general, gathering information is chal...

Bibliographic Details
Main Authors: Giuseppe Fusco, Lerina Aversano
Format: Article
Language: English
Published: PeerJ Inc. 2020-03-01
Series: PeerJ Computer Science
Subjects: Data integration; Heterogeneous data sources; Semantic integration; Ontologies
Online Access: https://peerj.com/articles/cs-254.pdf
id doaj-1acdb41b7b9540dd957b01a986ffd220
record_format Article
spelling doaj-1acdb41b7b9540dd957b01a986ffd220
2020-11-25T00:18:42Z
eng
PeerJ Inc.
PeerJ Computer Science
2376-5992
2020-03-01
6
e254
10.7717/peerj-cs.254
An approach for semantic integration of heterogeneous data sources
Giuseppe Fusco (Department of Engineering, University of Sannio, Benevento, BN, Italia)
Lerina Aversano (Department of Engineering, University of Sannio, Benevento, BN, Italia)
https://peerj.com/articles/cs-254.pdf
Data integration
Heterogeneous data sources
Semantic integration
Ontologies
collection DOAJ
language English
format Article
sources DOAJ
author Giuseppe Fusco
Lerina Aversano
title An approach for semantic integration of heterogeneous data sources
publisher PeerJ Inc.
series PeerJ Computer Science
issn 2376-5992
publishDate 2020-03-01
description Integrating data from multiple heterogeneous data sources entails dealing with data distributed among heterogeneous information sources, which can be structured, semi-structured or unstructured, and providing the user with a unified view of these data. Thus, in general, gathering information is challenging, and one of the main reasons is that data sources are designed to support specific applications. Very often their structure is unknown to most users. Moreover, the stored data is often redundant, mixed with information only needed to support enterprise processes, and incomplete with respect to the business domain. Collecting, integrating, reconciling and efficiently extracting information from heterogeneous and autonomous data sources is regarded as a major challenge. In this paper, we present an approach for the semantic integration of heterogeneous data sources, DIF (Data Integration Framework), and a software prototype to support all aspects of a complex data integration process. The proposed approach is an ontology-based generalization of both the Global-as-View and Local-as-View approaches. In particular, to overcome problems due to semantic heterogeneity and to support interoperability with external systems, ontologies are used as a conceptual schema to represent both the data sources to be integrated and the global view.
topic Data integration
Heterogeneous data sources
Semantic integration
Ontologies
url https://peerj.com/articles/cs-254.pdf