Relevance Assessment of Crowdsourced Data (CSD) Using Semantics and Geographic Information Retrieval (GIR) Techniques

Crowdsourced data (CSD) generated by citizens is becoming more popular as its potential utilization in many applications increases due to its currency and availability. However, the quality of CSD, including its relevance, is often questioned as the data is not generated by professionals nor follows...

Bibliographic Details
Main Authors:	Saman Koswatte, Kevin McDougall, Xiaoye Liu
Format:	Article
Language:	English
Published:	MDPI AG 2018-06-01
Series:	ISPRS International Journal of Geo-Information
Subjects:	crowdsourced data, relevance, semantics, geographic information retrieval, natural language processing
Online Access:	http://www.mdpi.com/2220-9964/7/7/256

id	doaj-ec07f458cc7d491ebfbcd723da192d63
record_format	Article
spelling	doaj-ec07f458cc7d491ebfbcd723da192d63; 2020-11-24T23:14:19Z; eng; MDPI AG; ISPRS International Journal of Geo-Information; ISSN 2220-9964; 2018-06-01; vol. 7, no. 7, art. 256; DOI 10.3390/ijgi7070256; Relevance Assessment of Crowdsourced Data (CSD) Using Semantics and Geographic Information Retrieval (GIR) Techniques; Saman Koswatte, Kevin McDougall, Xiaoye Liu (all: School of Civil Engineering and Surveying, University of Southern Queensland, Darling Heights 4350, Australia); http://www.mdpi.com/2220-9964/7/7/256; crowdsourced data, relevance, semantics, geographic information retrieval, natural language processing
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Saman Koswatte, Kevin McDougall, Xiaoye Liu
spellingShingle	Saman Koswatte, Kevin McDougall, Xiaoye Liu; Relevance Assessment of Crowdsourced Data (CSD) Using Semantics and Geographic Information Retrieval (GIR) Techniques; ISPRS International Journal of Geo-Information; crowdsourced data, relevance, semantics, geographic information retrieval, natural language processing
author_facet	Saman Koswatte, Kevin McDougall, Xiaoye Liu
author_sort	Saman Koswatte
title	Relevance Assessment of Crowdsourced Data (CSD) Using Semantics and Geographic Information Retrieval (GIR) Techniques
title_short	Relevance Assessment of Crowdsourced Data (CSD) Using Semantics and Geographic Information Retrieval (GIR) Techniques
title_full	Relevance Assessment of Crowdsourced Data (CSD) Using Semantics and Geographic Information Retrieval (GIR) Techniques
title_fullStr	Relevance Assessment of Crowdsourced Data (CSD) Using Semantics and Geographic Information Retrieval (GIR) Techniques
title_full_unstemmed	Relevance Assessment of Crowdsourced Data (CSD) Using Semantics and Geographic Information Retrieval (GIR) Techniques
title_sort	relevance assessment of crowdsourced data (csd) using semantics and geographic information retrieval (gir) techniques
publisher	MDPI AG
series	ISPRS International Journal of Geo-Information
issn	2220-9964
publishDate	2018-06-01
description	Crowdsourced data (CSD) generated by citizens is becoming more popular as its currency and availability open up potential uses in many applications. However, the quality of CSD, including its relevance, is often questioned because the data is neither generated by professionals nor collected following standard data-collection procedures. The quality of CSD can be assessed according to a range of characteristics, including its relevance. This paper explores information relevance by using geographic information retrieval (GIR) techniques to identify the most relevant information in a set of crowdsourced data. The research tested a relevance assessment approach for CSD by adapting relevance assessment techniques available in the GIR domain. Thematic and geographic relevance were assessed by using natural language processing techniques to analyze the frequency of selected terms appearing in CSD reports. As a proof of concept, the study analyzed a subset of the crowdsourced reports from the 2011 Australian floods Crowdmap, based on a defined set of queries. The thematic and geographic specificities of the queries were 0.44 and 0.67, respectively, indicating that the queries were more geographically specific than thematically specific. A Spearman's rho of 0.62 indicated that the final ranked relevance lists agreed reasonably well with a manually classified list, confirming the potential of the approach for CSD relevance assessment. In particular, this research contributes to the field of CSD relevance assessment an integrated thematic and geographic relevance ranking process that uses a user-query specificity approach to improve the final ranking.
topic	crowdsourced data, relevance, semantics, geographic information retrieval, natural language processing
url	http://www.mdpi.com/2220-9964/7/7/256
work_keys_str_mv	AT samankoswatte relevanceassessmentofcrowdsourceddatacsdusingsemanticsandgeographicinformationretrievalgirtechniques AT kevinmcdougall relevanceassessmentofcrowdsourceddatacsdusingsemanticsandgeographicinformationretrievalgirtechniques AT xiaoyeliu relevanceassessmentofcrowdsourceddatacsdusingsemanticsandgeographicinformationretrievalgirtechniques
_version_	1725594973695377408
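
The abstract describes scoring reports by the frequency of selected thematic and geographic terms, combining the two scores using query specificity, and comparing the resulting ranking with a manual one via Spearman's rho. The following is a minimal sketch of that general idea, not the authors' implementation: the weighted-sum combination, the term lists, the sample reports, and all function names are assumptions made for illustration. Only the specificity values 0.44 and 0.67 come from the record, and using them directly as weights is itself an assumption.

```python
import re
from collections import Counter


def term_frequency_score(text, terms):
    """Fraction of tokens in `text` that match any term in `terms`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    return sum(counts[t] for t in terms) / len(tokens)


def rank_reports(reports, thematic_terms, geographic_terms,
                 w_thematic, w_geographic):
    """Return report indices ordered from most to least relevant.

    Assumption: the combined score is a weighted sum of the two
    term-frequency scores; the record does not state the actual formula.
    """
    scores = []
    for text in reports:
        thematic = term_frequency_score(text, thematic_terms)
        geographic = term_frequency_score(text, geographic_terms)
        scores.append(w_thematic * thematic + w_geographic * geographic)
    return sorted(range(len(reports)), key=lambda i: scores[i], reverse=True)


def spearman_rho(ranks_a, ranks_b):
    """Spearman's rho for two tie-free rankings of the same n items."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical reports and term lists, invented for this example.
reports = [
    "Flood water rising near Brisbane river",
    "Nice weather today",
    "Road closed in Toowoomba due to flood",
]
thematic_terms = {"flood", "water", "closed"}
geographic_terms = {"brisbane", "toowoomba"}

# Weights taken from the reported query specificities (assumed usage).
order = rank_reports(reports, thematic_terms, geographic_terms, 0.44, 0.67)
print(order)
```

A ranking produced this way can be converted to per-report rank positions and passed to `spearman_rho` together with a manually assigned ranking to measure their agreement; the simple formula used here assumes there are no tied ranks.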
