Multilingual Open Information Extraction: Challenges and Opportunities

The number of documents published on the Web in languages other than English grows every year. As a consequence, the need to extract useful information from different languages increases, highlighting the importance of research into Open Information Extraction (OIE) techniques. Different OIE methods...

Bibliographic Details
Main Authors: Daniela Barreiro Claro, Marlo Souza, Clarissa Castellã Xavier, Leandro Oliveira
Format: Article
Language: English
Published: MDPI AG 2019-07-01
Series: Information
Subjects: multilingual, open information extraction, parallel corpus
Online Access: https://www.mdpi.com/2078-2489/10/7/228
id doaj-91ff4ad0fe11444bad53c5e575849e4d
record_format Article
spelling doaj-91ff4ad0fe11444bad53c5e575849e4d
2020-11-24T22:07:24Z
eng
MDPI AG
Information
2078-2489
2019-07-01
10
7
228
10.3390/info10070228
info10070228
Multilingual Open Information Extraction: Challenges and Opportunities
Daniela Barreiro Claro [0]
Marlo Souza [1]
Clarissa Castellã Xavier [2]
Leandro Oliveira [3]
[0] FORMAS Research Group, Computer Science Department, Federal University of Bahia, Salvador - BA 40170-110, Brazil
[1] FORMAS Research Group, Computer Science Department, Federal University of Bahia, Salvador - BA 40170-110, Brazil
[2] FORMAS Research Group, Federal Institute of Rio Grande do Sul, Porto Alegre - RS 90030-040, Brazil
[3] FORMAS Research Group, Computer Science Department, Federal University of Bahia, Salvador - BA 40170-110, Brazil
The number of documents published on the Web in languages other than English grows every year. As a consequence, the need to extract useful information from different languages increases, highlighting the importance of research into Open Information Extraction (OIE) techniques. Different OIE methods have dealt with features from a single language; however, few approaches tackle multilingual aspects. In those approaches, multilingualism is restricted to processing text in different languages, rather than exploring cross-linguistic resources, which results in low precision due to the use of general rules. Multilingual methods have been applied to numerous problems in Natural Language Processing, achieving satisfactory results and demonstrating that knowledge acquisition for a language can be transferred to other languages to improve the quality of the facts extracted. We argue that a multilingual approach can enhance OIE methods as it is ideal to evaluate and compare OIE systems, and therefore can be applied to the collected facts. In this work, we discuss how the transfer of knowledge between languages can increase acquisition from multilingual approaches. We provide a roadmap of the Multilingual Open IE area concerning state-of-the-art studies. Additionally, we evaluate the transfer of knowledge to improve the quality of the facts extracted in each language. Moreover, we discuss the importance of a parallel corpus to evaluate and compare multilingual systems.
https://www.mdpi.com/2078-2489/10/7/228
multilingual
open information extraction
parallel corpus
collection DOAJ
language English
format Article
sources DOAJ
author Daniela Barreiro Claro
Marlo Souza
Clarissa Castellã Xavier
Leandro Oliveira
spellingShingle Daniela Barreiro Claro
Marlo Souza
Clarissa Castellã Xavier
Leandro Oliveira
Multilingual Open Information Extraction: Challenges and Opportunities
Information
multilingual
open information extraction
parallel corpus
author_facet Daniela Barreiro Claro
Marlo Souza
Clarissa Castellã Xavier
Leandro Oliveira
author_sort Daniela Barreiro Claro
title Multilingual Open Information Extraction: Challenges and Opportunities
title_short Multilingual Open Information Extraction: Challenges and Opportunities
title_full Multilingual Open Information Extraction: Challenges and Opportunities
title_fullStr Multilingual Open Information Extraction: Challenges and Opportunities
title_full_unstemmed Multilingual Open Information Extraction: Challenges and Opportunities
title_sort multilingual open information extraction: challenges and opportunities
publisher MDPI AG
series Information
issn 2078-2489
publishDate 2019-07-01
description The number of documents published on the Web in languages other than English grows every year. As a consequence, the need to extract useful information from different languages increases, highlighting the importance of research into Open Information Extraction (OIE) techniques. Different OIE methods have dealt with features from a single language; however, few approaches tackle multilingual aspects. In those approaches, multilingualism is restricted to processing text in different languages, rather than exploring cross-linguistic resources, which results in low precision due to the use of general rules. Multilingual methods have been applied to numerous problems in Natural Language Processing, achieving satisfactory results and demonstrating that knowledge acquisition for a language can be transferred to other languages to improve the quality of the facts extracted. We argue that a multilingual approach can enhance OIE methods as it is ideal to evaluate and compare OIE systems, and therefore can be applied to the collected facts. In this work, we discuss how the transfer of knowledge between languages can increase acquisition from multilingual approaches. We provide a roadmap of the Multilingual Open IE area concerning state-of-the-art studies. Additionally, we evaluate the transfer of knowledge to improve the quality of the facts extracted in each language. Moreover, we discuss the importance of a parallel corpus to evaluate and compare multilingual systems.
topic multilingual
open information extraction
parallel corpus
url https://www.mdpi.com/2078-2489/10/7/228
work_keys_str_mv AT danielabarreiroclaro multilingualopeninformationextractionchallengesandopportunities
AT marlosouza multilingualopeninformationextractionchallengesandopportunities
AT clarissacastellaxavier multilingualopeninformationextractionchallengesandopportunities
AT leandrooliveira multilingualopeninformationextractionchallengesandopportunities
_version_ 1725820665347440640