Abundant Topological Outliers in Social Media Data and Their Effect on Spatial Analysis.

Twitter and related social media feeds have become valuable data sources to many fields of research. Numerous researchers have thereby used social media posts for spatial analysis, since many of them contain explicit geographic locations. However, despite its widespread use within applied research,...

Bibliographic Details
Main Authors: Rene Westerholt, Enrico Steiger, Bernd Resch, Alexander Zipf
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2016-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC5017681?pdf=render
id doaj-e74378079c3e4eddbc7ee2027226cafb
record_format Article
spelling doaj-e74378079c3e4eddbc7ee2027226cafb; 2020-11-24T21:35:48Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2016-01-01; 11; 9; e0162360; 10.1371/journal.pone.0162360; Abundant Topological Outliers in Social Media Data and Their Effect on Spatial Analysis.; Rene Westerholt; Enrico Steiger; Bernd Resch; Alexander Zipf; http://europepmc.org/articles/PMC5017681?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Rene Westerholt
Enrico Steiger
Bernd Resch
Alexander Zipf
title Abundant Topological Outliers in Social Media Data and Their Effect on Spatial Analysis.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2016-01-01
description Twitter and related social media feeds have become valuable data sources for many fields of research. Numerous researchers have used social media posts for spatial analysis, since many of them contain explicit geographic locations. However, despite their widespread use within applied research, a thorough understanding of the underlying spatial characteristics of these data is still lacking. In this paper, we investigate how topological outliers influence the outcomes of spatial analyses of social media data. These outliers appear when different users contribute heterogeneous information about different phenomena simultaneously from similar locations. As a consequence, various messages representing different spatial phenomena are captured close to each other, and are at risk of being falsely related in a spatial analysis. Our results reveal indications of corresponding spurious effects when analyzing Twitter data. Further, we show how the outliers distort the range of outcomes of spatial analysis methods. This has a significant influence on the power of spatial inferential techniques, and, more generally, on the validity and interpretability of spatial analysis results. We further investigate in detail how the issues caused by topological outliers are composed. We show that multiple disturbing effects act simultaneously and that these are related to the geographic scales of the involved overlapping patterns. Our results show that at some scale configurations, the disturbances added through overlap are more severe than at others. Further, their behavior turns into a volatile and almost chaotic fluctuation when the scales of the involved patterns become too different. Overall, our results highlight the critical importance of thoroughly considering the specific characteristics of social media data when analyzing them spatially.
url http://europepmc.org/articles/PMC5017681?pdf=render