SyntaxGym: An Online Platform for Targeted Evaluation of Language Models

Targeted syntactic evaluations have yielded insights into the generalizations learned by neural network language models. However, this line of research requires an uncommon confluence of skills: both the theoretical knowledge needed to design controlled psycholinguistic experiments, and the technica...

Bibliographic Details
Main Authors: Gauthier, Jon (Author), Hu, Jennifer (Author), Wilcox, Ethan (Author), Qian, Peng (Author), Levy, Roger (Author)
Format: Article
Language: English
Published: Association for Computational Linguistics (ACL), 2021-12-01T17:49:30Z.
Online Access: https://hdl.handle.net/1721.1/138281
LEADER 01764 am a22002053u 4500
001 138281
042 |a dc 
100 1 0 |a Gauthier, Jon  |e author 
700 1 0 |a Hu, Jennifer  |e author 
700 1 0 |a Wilcox, Ethan  |e author 
700 1 0 |a Qian, Peng  |e author 
700 1 0 |a Levy, Roger  |e author 
245 0 0 |a SyntaxGym: An Online Platform for Targeted Evaluation of Language Models 
260 |b Association for Computational Linguistics (ACL),   |c 2021-12-01T17:49:30Z. 
856 |z Get fulltext  |u https://hdl.handle.net/1721.1/138281 
520 |a Targeted syntactic evaluations have yielded insights into the generalizations learned by neural network language models. However, this line of research requires an uncommon confluence of skills: both the theoretical knowledge needed to design controlled psycholinguistic experiments, and the technical proficiency needed to train and deploy large-scale language models. We present SyntaxGym, an online platform designed to make targeted evaluations accessible to both experts in NLP and linguistics, reproducible across computing environments, and standardized following the norms of psycholinguistic experimental design. This paper releases two tools of independent value for the computational linguistics community: 1. A website, syntaxgym.org, which centralizes the process of targeted syntactic evaluation and provides easy tools for analysis and visualization; 2. Two command-line tools, syntaxgym and lm-zoo, which allow any user to reproduce targeted syntactic evaluations and general language model inference on their own machine. 
546 |a en 
655 7 |a Article 
773 |t 10.18653/V1/2020.ACL-DEMOS.10 
773 |t Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations