Summary: | The scale effect is an important research topic in geography. When individual-level data are aggregated into areal units, the scale problem is unavoidable. The problem becomes more pronounced when mining collective patterns from big geo-data because of the characteristics of such extensive spatial data. Although multi-scale models have been constructed to mitigate this issue, most studies still choose a single scale arbitrarily when extracting spatial patterns. In this research, we introduce the nugget-sill ratio (NSR), derived from semi-variograms, as an indicator for identifying the optimal scale. We conducted two simulation experiments to demonstrate the feasibility of this method. The results show that the optimal scale is negatively correlated with spatial point density and positively correlated with the degree of dispersion in a point pattern. We also applied the proposed method to a case study using Weibo check-in data from Beijing, Shanghai, Chengdu, and Wuhan. Our study offers a new perspective for measuring the spatial heterogeneity of big geo-data and for selecting an optimal spatial scale for big data analytics.
|