CBAG: Conditional biomedical abstract generation.
Biomedical research papers often combine disjoint concepts in novel ways, such as when describing a newly discovered relationship between an understudied gene and an important disease. These concepts are often explicitly encoded as metadata keywords, such as the author-provided terms included with...
Main Authors: | Justin Sybrandt, Ilya Safro |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS) 2021-01-01 |
Series: | PLoS ONE |
Online Access: | https://doi.org/10.1371/journal.pone.0253905 |
id |
doaj-74f0df3208804bc1b0c9d625465a318b |
---|---|
record_format |
Article |
spelling |
doaj-74f0df3208804bc1b0c9d625465a318b 2021-07-22T04:30:28Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2021-01-01 16 7 e0253905 10.1371/journal.pone.0253905 CBAG: Conditional biomedical abstract generation. Justin Sybrandt Ilya Safro Biomedical research papers often combine disjoint concepts in novel ways, such as when describing a newly discovered relationship between an understudied gene and an important disease. These concepts are often explicitly encoded as metadata keywords, such as the author-provided terms included with many documents in the MEDLINE database. While substantial recent work has addressed the problem of text generation in a more general context, applications such as scientific writing assistants or hypothesis generation systems could benefit from the capacity to select the specific set of concepts that underpin a generated biomedical text. We propose a conditional language model following the transformer architecture. This model uses the "encoder stack" to encode concepts that a user wishes to discuss in the generated text. The "decoder stack" then follows the masked self-attention pattern to perform text generation, using both prior tokens and the encoded condition. We demonstrate that this approach provides significant control while still producing reasonable biomedical text. https://doi.org/10.1371/journal.pone.0253905 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Justin Sybrandt Ilya Safro |
title |
CBAG: Conditional biomedical abstract generation. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS ONE |
issn |
1932-6203 |
publishDate |
2021-01-01 |
description |
Biomedical research papers often combine disjoint concepts in novel ways, such as when describing a newly discovered relationship between an understudied gene and an important disease. These concepts are often explicitly encoded as metadata keywords, such as the author-provided terms included with many documents in the MEDLINE database. While substantial recent work has addressed the problem of text generation in a more general context, applications such as scientific writing assistants or hypothesis generation systems could benefit from the capacity to select the specific set of concepts that underpin a generated biomedical text. We propose a conditional language model following the transformer architecture. This model uses the "encoder stack" to encode concepts that a user wishes to discuss in the generated text. The "decoder stack" then follows the masked self-attention pattern to perform text generation, using both prior tokens and the encoded condition. We demonstrate that this approach provides significant control while still producing reasonable biomedical text. |
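The decoder pattern named in the description (masked self-attention over prior tokens, combined with attention over the encoded condition) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `attend` and `decoder_step`, the toy two-dimensional vectors, and the omission of learned projections, multiple heads, residual scaling, and layer normalization are all simplifying assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def attend(queries, keys, values, causal):
    """Scaled dot-product attention (hypothetical helper, no learned weights).

    With causal=True, query position i may only see key positions 0..i,
    which is the masking that keeps generation left-to-right.
    """
    scale = math.sqrt(len(keys[0]))
    outputs, weights = [], []
    for i, q in enumerate(queries):
        visible = range(i + 1) if causal else range(len(keys))
        w = softmax([dot(q, keys[j]) / scale for j in visible])
        # Hidden (future) positions receive exactly zero weight.
        row = w + [0.0] * (len(keys) - len(w))
        weights.append(row)
        outputs.append([
            sum(row[j] * values[j][d] for j in range(len(keys)))
            for d in range(len(values[0]))
        ])
    return outputs, weights

def decoder_step(tokens, condition):
    """One simplified decoder layer (assumed structure for illustration)."""
    # Masked self-attention over the tokens generated so far.
    self_out, self_w = attend(tokens, tokens, tokens, causal=True)
    # Unmasked attention over the encoded condition (the concept vectors).
    cross_out, cross_w = attend(self_out, condition, condition, causal=False)
    mixed = [[a + b for a, b in zip(s, c)] for s, c in zip(self_out, cross_out)]
    return mixed, self_w, cross_w

# Toy inputs: three token vectors and two encoded concept vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
condition = [[0.5, 0.5], [1.0, -1.0]]
hidden, self_weights, cross_weights = decoder_step(tokens, condition)
```

In this sketch each row of `self_weights` is zero to the right of the diagonal, so a token never attends to later tokens, while every row of `cross_weights` spreads over all condition vectors, which is how the selected concepts can influence each generated position.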
url |
https://doi.org/10.1371/journal.pone.0253905 |