Dependency parsing of biomedical text with BERT

Abstract. Background: Syntactic analysis, or parsing, is a key task in natural language processing and a required component for many text mining approaches. In recent years, Universal Dependencies (UD) has emerged as the leading formalism for dependency parsing. While a number of recent tasks center...


Bibliographic Details
Main Authors: Jenna Kanerva, Filip Ginter, Sampo Pyysalo
Format: Article
Language: English
Published: BMC, 2020-12-01
Series: BMC Bioinformatics
Subjects: Parsing, Deep learning, CRAFT
Online Access: https://doi.org/10.1186/s12859-020-03905-8
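
The article cataloged here concerns UD-style dependency parsing, which is conventionally evaluated with unlabeled and labeled attachment scores (UAS/LAS) over CoNLL-U output. As a minimal illustrative sketch (not code from the paper; the `score` helper and the example sentence are invented for illustration), the snippet below computes both metrics from aligned gold and predicted CoNLL-U token lines, using the standard CoNLL-U column layout (HEAD in column 7, DEPREL in column 8):

```python
def score(gold_lines, pred_lines):
    """Compute (UAS, LAS) from aligned CoNLL-U token lines.

    UAS counts tokens whose predicted head matches gold; LAS additionally
    requires the dependency relation label to match.
    """
    total = uas = las = 0
    for g, p in zip(gold_lines, pred_lines):
        gcols, pcols = g.split("\t"), p.split("\t")
        # Skip multiword-token ranges ("1-2") and empty nodes ("1.1"),
        # which are not scored in standard UD evaluation.
        if "-" in gcols[0] or "." in gcols[0]:
            continue
        total += 1
        if gcols[6] == pcols[6]:        # HEAD column
            uas += 1
            if gcols[7] == pcols[7]:    # DEPREL column
                las += 1
    return uas / total, las / total

# Toy two-token sentence; the predicted parse has the right heads but
# one wrong relation label, so UAS = 1.0 and LAS = 0.5.
gold = [
    "1\tCells\t_\t_\t_\t_\t2\tnsubj\t_\t_",
    "2\tdivide\t_\t_\t_\t_\t0\troot\t_\t_",
]
pred = [
    "1\tCells\t_\t_\t_\t_\t2\tobj\t_\t_",
    "2\tdivide\t_\t_\t_\t_\t0\troot\t_\t_",
]
print(score(gold, pred))  # (1.0, 0.5)
```

The official UD shared-task scorers are more elaborate (they realign tokenizations before scoring), but the head/label comparison above is the core of the LAS metric reported in such evaluations.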
id doaj-980e92cad093490ab14fb2d2b804c910
record_format Article
spelling doaj-980e92cad093490ab14fb2d2b804c910 (updated 2021-01-03T12:21:18Z); eng; BMC; BMC Bioinformatics; ISSN 1471-2105; 2020-12-01; vol. 21, Suppl. 23, pp. 1-12; doi:10.1186/s12859-020-03905-8; Dependency parsing of biomedical text with BERT; Jenna Kanerva, Filip Ginter, Sampo Pyysalo (TurkuNLP Group, University of Turku); topics: Parsing, Deep learning, CRAFT; abstract as given in the description field.
collection DOAJ
language English
format Article
sources DOAJ
author Jenna Kanerva
Filip Ginter
Sampo Pyysalo
spellingShingle Jenna Kanerva
Filip Ginter
Sampo Pyysalo
Dependency parsing of biomedical text with BERT
BMC Bioinformatics
Parsing
Deep learning
CRAFT
author_facet Jenna Kanerva
Filip Ginter
Sampo Pyysalo
author_sort Jenna Kanerva
title Dependency parsing of biomedical text with BERT
title_short Dependency parsing of biomedical text with BERT
title_full Dependency parsing of biomedical text with BERT
title_fullStr Dependency parsing of biomedical text with BERT
title_full_unstemmed Dependency parsing of biomedical text with BERT
title_sort dependency parsing of biomedical text with bert
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2020-12-01
description Abstract. Background: Syntactic analysis, or parsing, is a key task in natural language processing and a required component for many text mining approaches. In recent years, Universal Dependencies (UD) has emerged as the leading formalism for dependency parsing. While a number of recent tasks centering on UD have substantially advanced the state of the art in multilingual parsing, there has been relatively little study of parsing texts from specialized domains such as biomedicine. Methods: We explore the application of state-of-the-art neural dependency parsing methods to biomedical text using the recently introduced CRAFT-SA shared task dataset. The CRAFT-SA task broadly follows the UD representation and recent UD task conventions, allowing us to fine-tune the UD-compatible Turku Neural Parser and UDify parsers on the task data. We further evaluate the effect of transfer learning using a broad selection of BERT models, including several models pre-trained specifically for biomedical text processing. Results: We find that recently introduced neural parsing technology is capable of generating highly accurate analyses of biomedical text, substantially improving on the best performance reported in the original CRAFT-SA shared task. We also find that initialization with a deep transfer learning model pre-trained on in-domain texts is key to maximizing the performance of the parsing methods.
topic Parsing
Deep learning
CRAFT
url https://doi.org/10.1186/s12859-020-03905-8
work_keys_str_mv AT jennakanerva dependencyparsingofbiomedicaltextwithbert
AT filipginter dependencyparsingofbiomedicaltextwithbert
AT sampopyysalo dependencyparsingofbiomedicaltextwithbert
_version_ 1724350363216117760