General Strategy for Querying Web Sources in a Data Federation Environment

Bibliographic Details
Main Authors: Firat, Aykut (Author), Wu, Lynn (Contributor), Madnick, Stuart E. (Contributor)
Other Authors: Sloan School of Management (Contributor)
Format: Article
Language: English
Published: IGI Global, 2011-12-01T18:38:46Z.
Description
Summary: Modern database management systems support the inclusion and querying of non-relational sources within a data federation environment via wrappers. Developing wrappers for Web sources, however, entangles code with extraction and query-planning knowledge, which makes it a daunting task. We use the IBM DB2 federation engine to demonstrate the challenges of incorporating Web sources into a data federation. We then present a practical and general strategy for including and querying Web sources without requiring any changes to the underlying data federation technology. This strategy separates code from knowledge in wrapper development by introducing a general-purpose, capabilities-aware mini query planner and a data extraction engine. As a result, Web sources can be included in a data federation system more quickly and maintained more easily.