Application of BERT to Enable Gene Classification Based on Clinical Evidence

The identification of profiled cancer-related genes plays an essential role in cancer diagnosis and treatment. According to the literature, genetic mutations are still classified manually. Manual classification of genetic mutations is pathologist-dependent, subjective...


Bibliographic Details
Main Authors: Yuhan Su, Hongxin Xiang, Haotian Xie, Yong Yu, Shiyan Dong, Zhaogang Yang, Na Zhao
Format: Article
Language: English
Published: Hindawi Limited 2020-01-01
Series: BioMed Research International
Online Access: http://dx.doi.org/10.1155/2020/5491963
id doaj-6948e128ab7a4f75b6946eac8ce3bbdf
record_format Article
spelling doaj-6948e128ab7a4f75b6946eac8ce3bbdf
2020-11-25T03:44:58Z
eng
Hindawi Limited
BioMed Research International
2314-6133
2314-6141
2020-01-01
2020
10.1155/2020/5491963
5491963
Application of BERT to Enable Gene Classification Based on Clinical Evidence
Yuhan Su [0]: National Pilot School of Software, Yunnan University, Kunming, 650091, China
Hongxin Xiang [1]: National Pilot School of Software, Yunnan University, Kunming, 650091, China
Haotian Xie [2]: Department of Mathematics, The Ohio State University, Columbus, OH 43210, USA
Yong Yu [3]: National Pilot School of Software, Yunnan University, Kunming, 650091, China
Shiyan Dong [4]: Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
Zhaogang Yang [5]: Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
Na Zhao [6]: National Pilot School of Software, Yunnan University, Kunming, 650091, China
http://dx.doi.org/10.1155/2020/5491963
collection DOAJ
language English
format Article
sources DOAJ
author Yuhan Su
Hongxin Xiang
Haotian Xie
Yong Yu
Shiyan Dong
Zhaogang Yang
Na Zhao
title Application of BERT to Enable Gene Classification Based on Clinical Evidence
publisher Hindawi Limited
series BioMed Research International
issn 2314-6133
2314-6141
publishDate 2020-01-01
description The identification of profiled cancer-related genes plays an essential role in cancer diagnosis and treatment. According to the literature, genetic mutations are still classified manually. Manual classification of genetic mutations is pathologist-dependent, subjective, and time-consuming. To improve the accuracy of clinical interpretation, scientists have proposed computational approaches for the automatic analysis of mutations, made possible by the advent of next-generation sequencing technologies. Nevertheless, challenges such as multiple classifications, the complexity of texts, redundant descriptions, and inconsistent interpretation have limited the development of algorithms. To overcome these difficulties, we have adapted a deep learning method named Bidirectional Encoder Representations from Transformers (BERT) to classify genetic mutations based on text evidence from an annotated database. During training, three challenging features of the data were addressed: the extreme length of texts, biased data presentation, and high repeatability. Finally, the BERT+abstract model demonstrates satisfactory results, with a logarithmic loss of 0.80, a recall of 0.6837, and an F-measure of 0.705. It is feasible for BERT to classify genomic mutation text within literature-based datasets. Consequently, BERT is a practical tool for facilitating and significantly speeding up cancer research on tumor progression, diagnosis, and the design of more precise and effective treatments.
url http://dx.doi.org/10.1155/2020/5491963
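
The description reports three evaluation metrics: logarithmic loss, recall, and F-measure. As a rough illustration of what these numbers measure, the following minimal Python sketch computes them for a multiclass classifier. It is not the authors' evaluation code: the function names, the toy labels and probabilities, and the choice of macro averaging are all assumptions made here for illustration, since the record does not state how the per-class scores were averaged.

```python
import math


def log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for label, p in zip(y_true, probs):
        # Clip to avoid log(0) when the model assigns zero probability.
        total += -math.log(min(max(p[label], eps), 1.0))
    return total / len(y_true)


def macro_recall_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class recall and F1 (macro averaging, an assumption)."""
    recalls, f1s = [], []
    for c in range(n_classes):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        recalls.append(recall)
        f1s.append(f1)
    return sum(recalls) / n_classes, sum(f1s) / n_classes


# Hypothetical toy data: 4 samples, 3 classes.
y_true = [0, 1, 2, 1]
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.3, 0.5],
    [0.5, 0.4, 0.1],
]
# Predicted class is the index of the highest probability.
y_pred = [max(range(len(p)), key=p.__getitem__) for p in probs]

loss = log_loss(y_true, probs)
recall, f1 = macro_recall_f1(y_true, y_pred, n_classes=3)
print(f"log loss={loss:.4f} recall={recall:.4f} F-measure={f1:.4f}")
```

Lower logarithmic loss is better because it penalizes confident wrong probabilities, whereas recall and F-measure depend only on the predicted class labels and are better when higher.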