A method for the automated, reliable retrieval of publication-citation records.

BACKGROUND: Publication records and citation indices often are used to evaluate academic performance. For this reason, obtaining or computing them accurately is important. This can be difficult, largely due to a lack of complete knowledge of an individual's publication list and/or lack of time...

Bibliographic Details
Main Authors: Derek Ruths, Faiyaz Al Zamal
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2010-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC2924380?pdf=render
id doaj-01500706324d4f1f99b76e0a91c39e7a
record_format Article
spelling doaj-01500706324d4f1f99b76e0a91c39e7a
2020-11-25T02:22:00Z
eng
Public Library of Science (PLoS)
PLoS ONE
1932-6203
2010-01-01
5
8
e12133
10.1371/journal.pone.0012133
A method for the automated, reliable retrieval of publication-citation records.
Derek Ruths
Faiyaz Al Zamal
http://europepmc.org/articles/PMC2924380?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Derek Ruths
Faiyaz Al Zamal
title A method for the automated, reliable retrieval of publication-citation records.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2010-01-01
description BACKGROUND: Publication records and citation indices are often used to evaluate academic performance. For this reason, obtaining or computing them accurately is important. This can be difficult, largely due to a lack of complete knowledge of an individual's publication list and/or a lack of time available to manually obtain or construct the publication-citation record. While online publication search engines have somewhat addressed these problems, using raw search results can yield inaccurate estimates of publication-citation records and citation indices.
METHODOLOGY: In this paper, we present a new, automated method that produces estimates of an individual's publication-citation record from the individual's name and a set of domain-specific vocabulary that may occur in the individual's publication titles. Because this vocabulary can be harvested directly from a research web page or an online (partial) publication list, our method delivers an easy way to obtain estimates of a publication-citation record and the relevant citation indices. Our method works by applying a series of stringent name and content filters to the raw publication search results returned by an online publication search engine. In this paper, our method is run using Google Scholar, but the underlying filters can be easily applied to any existing publication search engine. When compared against a manually constructed data set of individuals and their publication-citation records, our method provides significant improvements over raw search results. The estimated publication-citation records returned by our method have an average sensitivity of 98% and specificity of 72% (in contrast to raw search result specificity of less than 10%). When citation indices are computed using these records, the estimated indices are within 10% of the true value, compared to raw search results, which overestimate them by 75% on average.
CONCLUSIONS: These results confirm that our method provides significantly improved estimates over raw search results, and these can either be used directly for large-scale (departmental or university) analysis or further refined manually to quickly give accurate publication-citation records.
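The filtering idea in the description can be illustrated with a small sketch. This is not the authors' implementation: the `Result` shape, the function names, the matching rules (surname plus first name or initial, at least one shared vocabulary word), and the sample data are all assumptions made up for illustration. The h-index is used here only as one familiar example of a citation index.

```python
import re
from dataclasses import dataclass


@dataclass
class Result:
    """One raw search hit (hypothetical shape, not a real search-engine API type)."""
    title: str
    authors: list[str]
    citations: int


def name_matches(author: str, first: str, last: str) -> bool:
    """Assumed name filter: same surname, and given name equal to the
    full first name or to its initial ('Jane Doe', 'J Doe', 'J. Doe')."""
    parts = author.replace(".", " ").split()
    if len(parts) < 2 or parts[-1].lower() != last.lower():
        return False
    given = parts[0].lower()
    return given == first.lower() or given == first[0].lower()


def title_matches(title: str, vocabulary: set[str]) -> bool:
    """Assumed content filter: the title shares at least one word
    with the domain-specific vocabulary."""
    words = set(re.findall(r"[a-z]+", title.lower()))
    return bool(words & vocabulary)


def filter_results(results, first, last, vocabulary):
    """Keep only hits that pass both the name and the content filter."""
    return [
        r for r in results
        if any(name_matches(a, first, last) for a in r.authors)
        and title_matches(r.title, vocabulary)
    ]


def h_index(citation_counts):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)


# Made-up search results for a hypothetical researcher named Jane Doe.
raw = [
    Result("Inferring signaling network topology", ["J Doe", "A Smith"], 40),
    Result("Soil erosion in river deltas", ["J. Doe"], 90),       # same name, other field
    Result("Network models of cell signaling", ["Jane Doe"], 12),
    Result("Signaling pathways in yeast", ["M Doerr"], 55),        # different surname
]
vocabulary = {"network", "signaling"}

kept = filter_results(raw, "Jane", "Doe", vocabulary)
raw_h = h_index([r.citations for r in raw])
filtered_h = h_index([r.citations for r in kept])
print(len(raw), len(kept), raw_h, filtered_h)
```

In this toy data the unfiltered hits give a higher h-index than the filtered ones, which mirrors the kind of overestimate from raw search results that the description reports. A real filter would need stricter rules than the single-word overlap assumed here.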
url http://europepmc.org/articles/PMC2924380?pdf=render