Multilingual Zero-Shot and Few-Shot Causality Detection

Relations that hold between causes and their effects are fundamental for a wide range of sectors. Automatically finding sentences that express such relations may, for example, be of great interest to the economy or to political institutions. However, for many languages other than English, a lack of training resources for this task needs to be dealt with.

Full description

Bibliographic Details
Main Author: Reimann, Sebastian Michael
Format: Others
Language: English
Published: Uppsala universitet, Institutionen för lingvistik och filologi 2021
Subjects:
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-446516
id ndltd-UPSALLA1-oai-DiVA.org-uu-446516
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-uu-446516 2021-06-22T05:25:03Z
Multilingual Zero-Shot and Few-Shot Causality Detection
eng
Reimann, Sebastian Michael
Uppsala universitet, Institutionen för lingvistik och filologi
2021
classification; causality; causal relation; multilingual; cross-lingual; zero-shot; few-shot; bert; xlm-r; laser
Language Technology (Computational Linguistics)
Språkteknologi (språkvetenskaplig databehandling)
Relations that hold between causes and their effects are fundamental for a wide range of sectors. Automatically finding sentences that express such relations may, for example, be of great interest to the economy or to political institutions. However, for many languages other than English, a lack of training resources for this task needs to be dealt with. In recent years, large pretrained transformer-based model architectures have proven very effective for tasks involving cross-lingual transfer, such as cross-lingual language inference as well as multilingual named entity recognition, POS tagging and dependency parsing, which may hint at similar potential for causality detection. In this thesis, we define causality detection as a binary labelling problem and use cross-lingual transfer to alleviate data scarcity for German and Swedish, using three different classifiers that make use of either multilingual sentence embeddings obtained from a pretrained encoder or pretrained multilingual language models. The source language in most of our experiments is English; for Swedish, however, we also use a small German training set and a combination of English and German training data. We try out zero-shot transfer as well as using limited amounts of target-language data, either as a development set or as additional training data in a few-shot setting. In the latter scenario, we explore the impact of varying training-data sizes.
Moreover, data scarcity in our situation also makes it necessary to work with data from different annotation projects, and we explore how much this impacts our results. For German as a target language, our results in the zero-shot scenario expectedly fall short of monolingual experiments, but F1-macro scores between 60 and 65 in cases where annotation did not differ drastically still signal that at least some knowledge could be transferred. Introducing even small amounts of target-language data already yielded notable improvements, and with the full German training data of about 3,000 sentences combined with the most suitable English data set, the performance for German in some scenarios almost matches the state of the art for monolingual experiments on English. The best zero-shot performance on the Swedish data even outperformed the scores achieved for German. However, due to problems with the additional Swedish training data, we were not able to improve upon the zero-shot performance in a few-shot setting in the same manner as for German.
Student thesis
info:eu-repo/semantics/bachelorThesis
text
http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-446516
application/pdf
info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic classification
causality
causal relation
multilingual
cross-lingual
zero-shot
few-shot
bert
xlm-r
laser
Language Technology (Computational Linguistics)
Språkteknologi (språkvetenskaplig databehandling)
spellingShingle classification
causality
causal relation
multilingual
cross-lingual
zero-shot
few-shot
bert
xlm-r
laser
Language Technology (Computational Linguistics)
Språkteknologi (språkvetenskaplig databehandling)
Reimann, Sebastian Michael
Multilingual Zero-Shot and Few-Shot Causality Detection
description Relations that hold between causes and their effects are fundamental for a wide range of sectors. Automatically finding sentences that express such relations may, for example, be of great interest to the economy or to political institutions. However, for many languages other than English, a lack of training resources for this task needs to be dealt with. In recent years, large pretrained transformer-based model architectures have proven very effective for tasks involving cross-lingual transfer, such as cross-lingual language inference as well as multilingual named entity recognition, POS tagging and dependency parsing, which may hint at similar potential for causality detection. In this thesis, we define causality detection as a binary labelling problem and use cross-lingual transfer to alleviate data scarcity for German and Swedish, using three different classifiers that make use of either multilingual sentence embeddings obtained from a pretrained encoder or pretrained multilingual language models. The source language in most of our experiments is English; for Swedish, however, we also use a small German training set and a combination of English and German training data. We try out zero-shot transfer as well as using limited amounts of target-language data, either as a development set or as additional training data in a few-shot setting. In the latter scenario, we explore the impact of varying training-data sizes. Moreover, data scarcity in our situation also makes it necessary to work with data from different annotation projects, and we explore how much this impacts our results. For German as a target language, our results in the zero-shot scenario expectedly fall short of monolingual experiments, but F1-macro scores between 60 and 65 in cases where annotation did not differ drastically still signal that at least some knowledge could be transferred.
Introducing even small amounts of target-language data already yielded notable improvements, and with the full German training data of about 3,000 sentences combined with the most suitable English data set, the performance for German in some scenarios almost matches the state of the art for monolingual experiments on English. The best zero-shot performance on the Swedish data even outperformed the scores achieved for German. However, due to problems with the additional Swedish training data, we were not able to improve upon the zero-shot performance in a few-shot setting in the same manner as for German.
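The setup the abstract describes — training a binary causality classifier on source-language sentence representations and applying it zero-shot to a target language — can be sketched as follows. This is an illustrative sketch, not the thesis code: the embedding vectors below are random placeholders standing in for output of a multilingual sentence encoder such as LASER (whose vectors for translation-equivalent sentences lie close together across languages), and the classifier and F1-macro evaluation use scikit-learn.

```python
# Sketch: zero-shot cross-lingual causality detection as binary
# classification over multilingual sentence embeddings.
# NOTE: fake_embed() produces random placeholder vectors; in the thesis
# these would be real encoder outputs for English/German sentences.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
DIM = 64  # placeholder embedding size (LASER, for instance, uses 1024)

def fake_embed(n, causal):
    # Placeholder: causal sentences cluster around +0.5 per dimension,
    # non-causal around -0.5, mimicking a language-neutral embedding space.
    center = 0.5 if causal else -0.5
    return rng.normal(center, 1.0, size=(n, DIM))

# "English" training data: 200 causal + 200 non-causal sentence vectors.
X_en = np.vstack([fake_embed(200, True), fake_embed(200, False)])
y_en = np.array([1] * 200 + [0] * 200)

# "German" test data from the same embedding space — the cross-lingual
# assumption that makes zero-shot transfer possible at all.
X_de = np.vstack([fake_embed(50, True), fake_embed(50, False)])
y_de = np.array([1] * 50 + [0] * 50)

# Zero-shot: fit on English only, no German labels seen during training.
clf = LogisticRegression(max_iter=1000).fit(X_en, y_en)

# F1-macro, the metric reported in the thesis (here scaled 0-1, not 0-100).
macro_f1 = f1_score(y_de, clf.predict(X_de), average="macro")
print(f"zero-shot F1-macro: {macro_f1:.2f}")
```

A few-shot variant, in these terms, would simply stack a small labelled German sample onto `X_en`/`y_en` before fitting, which is the kind of target-language injection the abstract reports as yielding notable improvements.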
author Reimann, Sebastian Michael
author_facet Reimann, Sebastian Michael
author_sort Reimann, Sebastian Michael
title Multilingual Zero-Shot and Few-Shot Causality Detection
title_short Multilingual Zero-Shot and Few-Shot Causality Detection
title_full Multilingual Zero-Shot and Few-Shot Causality Detection
title_fullStr Multilingual Zero-Shot and Few-Shot Causality Detection
title_full_unstemmed Multilingual Zero-Shot and Few-Shot Causality Detection
title_sort multilingual zero-shot and few-shot causality detection
publisher Uppsala universitet, Institutionen för lingvistik och filologi
publishDate 2021
url http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-446516
work_keys_str_mv AT reimannsebastianmichael multilingualzeroshotandfewshotcausalitydetection
_version_ 1719411549715562496