An update on Uniform Resource Locator (URL) decay in MEDLINE abstracts and measures for its mitigation

Abstract. Background: For years, Uniform Resource Locator (URL) decay or "link rot" has been a growing concern in the field of biomedical sciences. This paper addresses this issue by examining the status of the URLs published in MEDLINE abstract...


Bibliographic Details
Main Authors: Fontelo Paul, Liu Fang, Ducut Erick
Format: Article
Language: English
Published: BMC 2008-06-01
Series: BMC Medical Informatics and Decision Making
Online Access: http://www.biomedcentral.com/1472-6947/8/23
id doaj-aaaa806441b14707834a0d692254d7f0
record_format Article
spelling doaj-aaaa806441b14707834a0d692254d7f0 2020-11-25T00:44:41Z eng BMC BMC Medical Informatics and Decision Making 1472-6947 2008-06-01 8 1 23 10.1186/1472-6947-8-23 An update on Uniform Resource Locator (URL) decay in MEDLINE abstracts and measures for its mitigation Fontelo Paul Liu Fang Ducut Erick http://www.biomedcentral.com/1472-6947/8/23
collection DOAJ
language English
format Article
sources DOAJ
author Fontelo Paul
Liu Fang
Ducut Erick
spellingShingle Fontelo Paul
Liu Fang
Ducut Erick
An update on Uniform Resource Locator (URL) decay in MEDLINE abstracts and measures for its mitigation
BMC Medical Informatics and Decision Making
author_facet Fontelo Paul
Liu Fang
Ducut Erick
author_sort Fontelo Paul
title An update on Uniform Resource Locator (URL) decay in MEDLINE abstracts and measures for its mitigation
title_sort update on uniform resource locator (url) decay in medline abstracts and measures for its mitigation
publisher BMC
series BMC Medical Informatics and Decision Making
issn 1472-6947
publishDate 2008-06-01
description Abstract.
Background: For years, Uniform Resource Locator (URL) decay or "link rot" has been a growing concern in the field of biomedical sciences. This paper addresses this issue by examining the status of the URLs published in MEDLINE abstracts, establishing current availability and estimating URL decay in these records from 1994 to 2006. We also reviewed the information provided at each URL to determine whether the content the author cited when writing the paper is still the information available at the URL. Lastly, we determined which of the documented methods recommended for preserving URL links have gained acceptance among authors and publishers.
Methods: MEDLINE records from 1994 to 2006, obtained from the National Library of Medicine in Extensible Markup Language (XML) format, were processed, yielding 10,208 URL addresses. These were accessed once daily at random times for 30 days. Titles and abstracts were also searched for the presence of archival tools such as WebCite, Persistent URL (PURL) and Digital Object Identifier (DOI).
Results: URL length ranged from 13 to 425 characters, with a mean length of 35 characters [Standard Deviation (SD) = 13.51; 95% confidence interval (CI) 13.25 to 13.77]. The most common top-level domains were ".org" and ".edu", each with 34%. About 81% of the URL pool was available 90% to 100% of the time, but only 78% of these contained the actual information mentioned in the MEDLINE record. "Dead" URLs constituted 16% of the total. Finally, a survey of archival tool usage showed that since its introduction in 1998, DOI had been incorporated in only 519 of the abstracts reviewed.
Conclusion: URL persistence, at approximately 81% general availability during the 1-month study period, parallels that reported in previous studies. As peer-reviewed literature remains the main source of information in biomedicine, we need to ensure the accuracy and preservation of these links.
url http://www.biomedcentral.com/1472-6947/8/23
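
The Methods in the description say each URL was accessed once daily for 30 days, and the Results group URLs by how often they were available. A minimal sketch of how such a check log might be summarized follows. It is not the authors' code: the function names, the sample data, and the "intermittent" label are hypothetical, and treating "dead" as a URL that never responded is an assumption, since the record does not define the term. Only the 90% threshold comes from the description.

```python
from collections import Counter


def success_rate(checks):
    """Return the fraction of daily checks in which the URL responded."""
    if not checks:
        raise ValueError("need at least one check")
    return sum(checks) / len(checks)


def classify(checks, available_threshold=0.9):
    """Label one URL from its list of daily check results (True = responded).

    'available' follows the 90%-to-100% band in the description.
    'dead' here means no successful check at all (an assumption).
    'intermittent' is a hypothetical label for everything in between.
    """
    rate = success_rate(checks)
    if rate >= available_threshold:
        return "available"
    if rate == 0:
        return "dead"
    return "intermittent"


def summarize(log):
    """Return the share of URLs in each class for a {url: checks} log."""
    counts = Counter(classify(checks) for checks in log.values())
    total = len(log)
    return {label: counts[label] / total
            for label in ("available", "intermittent", "dead")}


# Made-up 30-day logs for four placeholder URLs.
sample_log = {
    "http://example.org/a": [True] * 30,
    "http://example.org/b": [True] * 28 + [False] * 2,
    "http://example.org/c": [True] * 15 + [False] * 15,
    "http://example.org/d": [False] * 30,
}

print(summarize(sample_log))
```

The sketch only aggregates results that have already been collected; fetching the URLs, choosing timeouts, and deciding which HTTP responses count as a success are left out because the record does not describe them.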