A Knowledge-Driven Approach to Extract Disease-Related Biomarkers from the Literature

The biomedical literature represents a rich source of biomarker information. However, both the size of literature databases and their lack of standardization hamper the automatic exploitation of the information contained in these resources. Text mining approaches have proven to be useful for the exp...

Bibliographic Details
Main Authors: À. Bravo, M. Cases, N. Queralt-Rosinach, F. Sanz, L. I. Furlong
Format: Article
Language: English
Published: Hindawi Limited 2014-01-01
Series: BioMed Research International
Online Access: http://dx.doi.org/10.1155/2014/253128
id doaj-c39d9cd4ccb246c8a43b88bf552a65e1
record_format Article
spelling doaj-c39d9cd4ccb246c8a43b88bf552a65e1 2020-11-25T00:22:38Z eng Hindawi Limited BioMed Research International 2314-6133 2314-6141 2014-01-01 2014 10.1155/2014/253128 253128 A Knowledge-Driven Approach to Extract Disease-Related Biomarkers from the Literature
À. Bravo, M. Cases, N. Queralt-Rosinach, F. Sanz, L. I. Furlong
Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, C/Dr Aiguader 88, E-08003 Barcelona, Spain (same affiliation listed for each of the five authors)
The biomedical literature represents a rich source of biomarker information. However, both the size of literature databases and their lack of standardization hamper the automatic exploitation of the information contained in these resources. Text mining approaches have proven to be useful for the exploitation of information contained in the scientific publications. Here, we show that a knowledge-driven text mining approach can exploit a large literature database to extract a dataset of biomarkers related to diseases covering all therapeutic areas. Our methodology takes advantage of the annotation of MEDLINE publications pertaining to biomarkers with MeSH terms, narrowing the search to specific publications and, therefore, minimizing the false positive ratio. It is based on a dictionary-based named entity recognition system and a relation extraction module. The application of this methodology resulted in the identification of 131,012 disease-biomarker associations between 2,803 genes and 2,751 diseases, and represents a valuable knowledge base for those interested in disease-related biomarkers. Additionally, we present a bibliometric analysis of the journals reporting biomarker related information during the last 40 years.
http://dx.doi.org/10.1155/2014/253128
collection DOAJ
language English
format Article
sources DOAJ
author À. Bravo
M. Cases
N. Queralt-Rosinach
F. Sanz
L. I. Furlong
title A Knowledge-Driven Approach to Extract Disease-Related Biomarkers from the Literature
publisher Hindawi Limited
series BioMed Research International
issn 2314-6133
2314-6141
publishDate 2014-01-01
description The biomedical literature represents a rich source of biomarker information. However, both the size of literature databases and their lack of standardization hamper the automatic exploitation of the information contained in these resources. Text mining approaches have proven to be useful for the exploitation of information contained in the scientific publications. Here, we show that a knowledge-driven text mining approach can exploit a large literature database to extract a dataset of biomarkers related to diseases covering all therapeutic areas. Our methodology takes advantage of the annotation of MEDLINE publications pertaining to biomarkers with MeSH terms, narrowing the search to specific publications and, therefore, minimizing the false positive ratio. It is based on a dictionary-based named entity recognition system and a relation extraction module. The application of this methodology resulted in the identification of 131,012 disease-biomarker associations between 2,803 genes and 2,751 diseases, and represents a valuable knowledge base for those interested in disease-related biomarkers. Additionally, we present a bibliometric analysis of the journals reporting biomarker related information during the last 40 years.
url http://dx.doi.org/10.1155/2014/253128