Data Extraction Based on Page Structure Analysis

The information we need is often dispersed and organized in differing structures. In addition, because of unstructured data such as natural language and images, extracting local content pages is extremely difficult. In light of the problems above, th...

Bibliographic Details
Main Authors: Ren Yichao, Tian Jiayin
Format: Article
Language: English
Published: EDP Sciences 2017-01-01
Series: MATEC Web of Conferences
Online Access: https://doi.org/10.1051/matecconf/201713900118