Enhancing direct citations: A comparison of relatedness measures for community detection in a large set of PubMed publications
The effects of enhancing direct citations, with respect to publication–publication relatedness measurement, by indirect citation relations (bibliographic coupling, cocitation, and extended direct citations) and text relations on clustering solution accuracy are analyzed. For co...
Main Authors: | Ahlgren, Per; Chen, Yunwei; Colliander, Cristian; van Eck, Nees Jan |
---|---|
Format: | Article |
Language: | English |
Published: | The MIT Press, 2020-03-01 |
Series: | Quantitative Science Studies |
Online Access: | https://www.mitpressjournals.org/doi/abs/10.1162/qss_a_00027 |
id |
doaj-751d1caa4462446289d976f97707ec4c |
---|---|
record_format |
Article |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Ahlgren, Per Chen, Yunwei Colliander, Cristian van Eck, Nees Jan |
author_sort |
Ahlgren, Per |
title |
Enhancing direct citations: A comparison of relatedness measures for community detection in a large set of PubMed publications |
publisher |
The MIT Press |
series |
Quantitative Science Studies |
issn |
2641-3337 |
publishDate |
2020-03-01 |
description |
The effects of enhancing direct citations, with respect to publication–publication relatedness measurement, by indirect citation relations (bibliographic coupling, cocitation, and extended direct citations) and text relations on clustering solution accuracy are analyzed. For comparison, we include each approach that is involved in the enhancement of direct citations. In total, we investigate the relative performance of seven approaches. To evaluate the approaches we use a methodology proposed by earlier research. However, the evaluation criterion used is based on MeSH, one of the most sophisticated publication-level classification schemes available. We also introduce an approach, based on interpolated accuracy values, by which overall relative clustering solution accuracy can be studied. The results show that the cocitation approach has the worst performance, and that the direct citations approach is outperformed by the other five investigated approaches. The extended direct citations approach has the best performance, followed by an approach in which direct citations are enhanced by the BM25 textual relatedness measure. An approach that combines direct citations with bibliographic coupling and cocitation performs slightly better than the bibliographic coupling approach, which in turn has a better performance than the BM25 approach. |
url |
https://www.mitpressjournals.org/doi/abs/10.1162/qss_a_00027 |
_version_ |
1724591069020028928 |
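
The description names three citation-based relations between publications: direct citation, bibliographic coupling, and cocitation. The following toy sketch shows how each could be counted for a pair of publications, and how they might be added together. It is not the authors' implementation: the citation data, the function names, and the unweighted sum in `combined` are all illustrative assumptions, and the record gives no detail on the normalization or weighting actually used in the article.

```python
from itertools import combinations

# Hypothetical toy data: each key cites the publications in its set.
cites = {
    "A": {"C", "D"},
    "B": {"A", "C", "D"},
    "C": set(),
    "D": {"C"},
}


def direct_citation(i, j):
    """Return 1 if either publication cites the other, else 0."""
    return int(j in cites.get(i, set()) or i in cites.get(j, set()))


def bibliographic_coupling(i, j):
    """Return the number of references shared by i and j."""
    return len(cites.get(i, set()) & cites.get(j, set()))


def cocitation(i, j):
    """Return the number of publications that cite both i and j."""
    return sum(1 for refs in cites.values() if i in refs and j in refs)


def combined(i, j):
    """Unweighted sum of the three relations (an assumption for illustration)."""
    return direct_citation(i, j) + bibliographic_coupling(i, j) + cocitation(i, j)


if __name__ == "__main__":
    # Print direct citation, coupling, cocitation, and the combined score per pair.
    for i, j in combinations(sorted(cites), 2):
        print(i, j, direct_citation(i, j), bibliographic_coupling(i, j),
              cocitation(i, j), combined(i, j))
```

In this toy data, A and B are linked mainly through shared references, while C and D are linked mainly through being cited together, which is why combining relations can connect pairs that a single relation would score weakly.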