Enhancing direct citations: A comparison of relatedness measures for community detection in a large set of PubMed publications

The effects of enhancing direct citations, with respect to publication–publication relatedness measurement, by indirect citation relations (bibliographic coupling, cocitation, and extended direct citations) and text relations on clustering solution accuracy are analyzed. For co...

Bibliographic Details
Main Authors: Ahlgren, Per; Chen, Yunwei; Colliander, Cristian; van Eck, Nees Jan
Format: Article
Language: English
Published: The MIT Press 2020-03-01
Series: Quantitative Science Studies
Online Access: https://www.mitpressjournals.org/doi/abs/10.1162/qss_a_00027
id doaj-751d1caa4462446289d976f97707ec4c
record_format Article
collection DOAJ
language English
format Article
sources DOAJ
author Ahlgren, Per
Chen, Yunwei
Colliander, Cristian
van Eck, Nees Jan
title Enhancing direct citations: A comparison of relatedness measures for community detection in a large set of PubMed publications
publisher The MIT Press
series Quantitative Science Studies
issn 2641-3337
publishDate 2020-03-01
description The effects of enhancing direct citations, with respect to publication–publication relatedness measurement, by indirect citation relations (bibliographic coupling, cocitation, and extended direct citations) and text relations on clustering solution accuracy are analyzed. For comparison, we include each approach that is involved in the enhancement of direct citations. In total, we investigate the relative performance of seven approaches. To evaluate the approaches we use a methodology proposed by earlier research. However, the evaluation criterion used is based on MeSH, one of the most sophisticated publication-level classification schemes available. We also introduce an approach, based on interpolated accuracy values, by which overall relative clustering solution accuracy can be studied. The results show that the cocitation approach has the worst performance, and that the direct citations approach is outperformed by the other five investigated approaches. The extended direct citations approach has the best performance, followed by an approach in which direct citations are enhanced by the BM25 textual relatedness measure. An approach that combines direct citations with bibliographic coupling and cocitation performs slightly better than the bibliographic coupling approach, which in turn has a better performance than the BM25 approach.
url https://www.mitpressjournals.org/doi/abs/10.1162/qss_a_00027