Deriving Machine Attention from Human Rationales

Bibliographic Details
Main Authors:	Bao, Yujia (Author), Chang, Shiyu (Author), Yu, Mo (Author), Barzilay, Regina (Author)
Other Authors:	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory (Contributor)
Format:	Article
Language:	English
Published:	Association for Computational Linguistics (ACL), 2021-02-09T22:36:48Z.
Subjects:	Article

Description
Summary:	Attention-based models are successful when trained on large amounts of data. In this paper, we demonstrate that even in the low-resource scenario, attention can be learned effectively. To this end, we start with discrete human-annotated rationales and map them into continuous attention. Our central hypothesis is that this mapping is general across domains, and thus can be transferred from resource-rich domains to low-resource ones. Our model jointly learns a domain-invariant representation and induces the desired mapping between rationales and attention. Our empirical results validate this hypothesis and show that our approach delivers significant gains over state-of-the-art baselines, yielding over 15% average error reduction on benchmark datasets.

Similar Items