External features enriched model for biomedical question answering

Abstract. Background: Biomedical question answering (QA) is a domain-specific sub-task of natural language processing that aims to answer a biomedical question based on one or more related passages, and it can provide people with accurate healthcare-related information. Recently, a lot...

Bibliographic Details
Main Authors: Gezheng Xu, Wenge Rong, Yanmeng Wang, Yuanxin Ouyang, Zhang Xiong
Format: Article
Language: English
Published: BMC 2021-05-01
Series: BMC Bioinformatics
Subjects:
POS
NER
Online Access: https://doi.org/10.1186/s12859-021-04176-7
id doaj-94ce15823b0743489fc3b1d3a511d847
record_format Article
spelling doaj-94ce15823b0743489fc3b1d3a511d847; 2021-05-30T11:52:53Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2021-05-01; 22; 1; 1; 19; 10.1186/s12859-021-04176-7; External features enriched model for biomedical question answering; Gezheng Xu (State Key Laboratory of Software Development Environment, Beihang University); Wenge Rong (State Key Laboratory of Software Development Environment, Beihang University); Yanmeng Wang (Ping An Technology); Yuanxin Ouyang (State Key Laboratory of Software Development Environment, Beihang University); Zhang Xiong (State Key Laboratory of Software Development Environment, Beihang University); Abstract. Background: Biomedical question answering (QA) is a domain-specific sub-task of natural language processing that aims to answer a biomedical question based on one or more related passages, and it can provide people with accurate healthcare-related information. Recently, many approaches based on neural networks and large-scale pre-trained language models have substantially improved its performance. However, given the lexical characteristics of biomedical corpora and the small scale of the available datasets, there is still considerable room for improvement in biomedical QA tasks. Results: Inspired by the importance of syntactic and lexical features in biomedical corpora, we proposed a new framework that extracts external features, such as part-of-speech (POS) tags and named-entity recognition (NER) labels, and fuses them with the original text representation encoded by a pre-trained language model to enhance biomedical question answering performance. Our model achieves an overall improvement on all three metrics on the BioASQ 6b, 7b, and 8b factoid question answering tasks. Conclusions: Experiments on the BioASQ question answering dataset demonstrated the effectiveness of our external feature-enriched framework. They show that external lexical and syntactic features can improve the performance of pre-trained language models on biomedical question answering.; https://doi.org/10.1186/s12859-021-04176-7; Biomedical question answering; Feature fusion; Pre-trained language model; POS; NER
collection DOAJ
language English
format Article
sources DOAJ
author Gezheng Xu
Wenge Rong
Yanmeng Wang
Yuanxin Ouyang
Zhang Xiong
spellingShingle Gezheng Xu
Wenge Rong
Yanmeng Wang
Yuanxin Ouyang
Zhang Xiong
External features enriched model for biomedical question answering
BMC Bioinformatics
Biomedical question answering
Feature fusion
Pre-trained language model
POS
NER
author_facet Gezheng Xu
Wenge Rong
Yanmeng Wang
Yuanxin Ouyang
Zhang Xiong
author_sort Gezheng Xu
title External features enriched model for biomedical question answering
title_short External features enriched model for biomedical question answering
title_full External features enriched model for biomedical question answering
title_fullStr External features enriched model for biomedical question answering
title_full_unstemmed External features enriched model for biomedical question answering
title_sort external features enriched model for biomedical question answering
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2021-05-01
description Abstract. Background: Biomedical question answering (QA) is a domain-specific sub-task of natural language processing that aims to answer a biomedical question based on one or more related passages, and it can provide people with accurate healthcare-related information. Recently, many approaches based on neural networks and large-scale pre-trained language models have substantially improved its performance. However, given the lexical characteristics of biomedical corpora and the small scale of the available datasets, there is still considerable room for improvement in biomedical QA tasks. Results: Inspired by the importance of syntactic and lexical features in biomedical corpora, we proposed a new framework that extracts external features, such as part-of-speech (POS) tags and named-entity recognition (NER) labels, and fuses them with the original text representation encoded by a pre-trained language model to enhance biomedical question answering performance. Our model achieves an overall improvement on all three metrics on the BioASQ 6b, 7b, and 8b factoid question answering tasks. Conclusions: Experiments on the BioASQ question answering dataset demonstrated the effectiveness of our external feature-enriched framework. They show that external lexical and syntactic features can improve the performance of pre-trained language models on biomedical question answering.
topic Biomedical question answering
Feature fusion
Pre-trained language model
POS
NER
url https://doi.org/10.1186/s12859-021-04176-7
work_keys_str_mv AT gezhengxu externalfeaturesenrichedmodelforbiomedicalquestionanswering
AT wengerong externalfeaturesenrichedmodelforbiomedicalquestionanswering
AT yanmengwang externalfeaturesenrichedmodelforbiomedicalquestionanswering
AT yuanxinouyang externalfeaturesenrichedmodelforbiomedicalquestionanswering
AT zhangxiong externalfeaturesenrichedmodelforbiomedicalquestionanswering
_version_ 1721419946008772608
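
Illustrative note on the description field: the abstract says that external POS and NER features are fused with the text representation from a pre-trained language model, but this record does not state how the fusion is done. The sketch below assumes the simplest possible choice, per-token concatenation of one-hot tag vectors, purely for illustration. The tag inventories, the function names `one_hot` and `fuse_features`, and the toy two-dimensional token vectors are all hypothetical and are not taken from the article.

```python
from typing import List, Sequence

# Hypothetical tag inventories, chosen only for illustration.
POS_TAGS = ["NOUN", "VERB", "ADJ", "OTHER"]
NER_TAGS = ["O", "GENE", "DISEASE"]


def one_hot(tag: str, inventory: Sequence[str]) -> List[float]:
    """Return a one-hot vector for `tag`; unknown tags map to all zeros."""
    return [1.0 if tag == candidate else 0.0 for candidate in inventory]


def fuse_features(
    token_vectors: List[List[float]],
    pos_tags: List[str],
    ner_tags: List[str],
) -> List[List[float]]:
    """Concatenate each token's text vector with its POS and NER one-hot vectors.

    Assumption: fusion by concatenation. The article may use a different
    mechanism; this only shows the general idea of enriching token
    representations with external features.
    """
    if not (len(token_vectors) == len(pos_tags) == len(ner_tags)):
        raise ValueError("every token needs exactly one POS tag and one NER tag")
    return [
        vec + one_hot(pos, POS_TAGS) + one_hot(ner, NER_TAGS)
        for vec, pos, ner in zip(token_vectors, pos_tags, ner_tags)
    ]


# Toy stand-ins for encoder outputs (2 dimensions per token).
token_vectors = [[0.1, 0.2], [0.3, 0.4]]
fused = fuse_features(token_vectors, ["NOUN", "VERB"], ["GENE", "O"])
print(len(fused[0]))  # 2 text dims + 4 POS dims + 3 NER dims
```

In a real system the toy vectors would be replaced by the encoder's hidden states, and the fused vectors would feed the answer-span prediction layer; learned tag embeddings are a common alternative to one-hot vectors.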