Tree-Based and Optimum Cut-Based Origin-Destination Flow Clustering

Data about the movements of diverse objects, including human beings, animals, and commodities, are collected in growing amounts as location-aware technologies become pervasive. Clustering has become an increasingly important analytical tool for revealing travel patterns from large-scale movement dat...

Bibliographic Details
Main Authors: Qiuliang Xiang, Qunyong Wu
Format: Article
Language: English
Published: MDPI AG 2019-10-01
Series: ISPRS International Journal of Geo-Information
Subjects: od flow; spatial clustering; spatiotemporal joint clustering; flow similarity measurement
Online Access: https://www.mdpi.com/2220-9964/8/11/477
id doaj-7e72e50cc3b049969ac8ac4cf8305d50
record_format Article
spelling doaj-7e72e50cc3b049969ac8ac4cf8305d50 2020-11-25T02:15:41Z eng
MDPI AG
ISPRS International Journal of Geo-Information, ISSN 2220-9964
2019-10-01, vol. 8, no. 11, art. 477
doi 10.3390/ijgi8110477
Tree-Based and Optimum Cut-Based Origin-Destination Flow Clustering
Qiuliang Xiang, Qunyong Wu (both: Key Lab of Spatial Data Mining and Information Sharing of Ministry of Education, Fuzhou University, Fuzhou 350108, China)
https://www.mdpi.com/2220-9964/8/11/477
od flow; spatial clustering; spatiotemporal joint clustering; flow similarity measurement
collection DOAJ
language English
format Article
sources DOAJ
author Qiuliang Xiang
Qunyong Wu
author_facet Qiuliang Xiang
Qunyong Wu
author_sort Qiuliang Xiang
title Tree-Based and Optimum Cut-Based Origin-Destination Flow Clustering
publisher MDPI AG
series ISPRS International Journal of Geo-Information
issn 2220-9964
publishDate 2019-10-01
description Data about the movements of diverse objects, including human beings, animals, and commodities, are collected in growing amounts as location-aware technologies become pervasive. Clustering has become an increasingly important analytical tool for revealing travel patterns from large-scale movement datasets. Most existing methods for origin-destination (OD) flow clustering focus on the geographic properties of an OD flow but ignore the temporal information preserved in the OD flow, which reflects the dynamic changes in the travel patterns over time. In addition, most methods require some predetermined parameters as inputs and are difficult to adjust considering the changes in the users’ demands. To overcome such limitations, we present a novel OD flow clustering method, namely, TOCOFC (Tree-based and Optimum Cut-based Origin-Destination Flow Clustering). A similarity measurement method is proposed to quantify the spatial similarity relationship between OD flows, and it can be extended to measure the spatiotemporal similarity between OD flows. By constructing a maximum spanning tree and splitting it into several unrelated parts, we effectively remove the noise in the flow data. Furthermore, a recursive two-way optimum cut-based method is utilized to partition the graph composed of OD flows into OD flow clusters. Moreover, a criterion called CSSC (Child tree/Child graph Self-Similarity Criterion) is formulated to determine if the clusters meet the output requirements. By modifying the parameters, TOCOFC can obtain clustering results for different time scales and spatial scales, which makes it possible to study movement patterns from a multiscale perspective. However, TOCOFC has the disadvantages of low efficiency and large memory consumption, and it is not conducive to quickly handling large-scale data. Compared with previous works, TOCOFC has a better clustering performance, which is reflected in the fact that TOCOFC can guarantee a balance between clusters and help to fully understand the corresponding patterns. Being able to perform the spatiotemporal clustering of OD flows is also a highlight of TOCOFC, which will help to capture the differences in the patterns at different times for a deeper analysis. Extensive experiments on both artificial spatial datasets and real-world spatiotemporal datasets have demonstrated the effectiveness and flexibility of TOCOFC.
topic od flow
spatial clustering
spatiotemporal joint clustering
flow similarity measurement
url https://www.mdpi.com/2220-9964/8/11/477
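
The description mentions building a maximum spanning tree over OD flows and splitting it into unrelated parts to remove noise. The following is a minimal Python sketch of that one step only, not the authors' implementation. The record does not give the paper's similarity formula, so `similarity` is a hypothetical stand-in based on the distances between origins and between destinations, and the `min_similarity` and `min_size` parameters are likewise assumptions made for illustration. The recursive two-way optimum cut and the CSSC criterion are not reproduced here.

```python
from itertools import combinations
from math import hypot


def similarity(f, g):
    """Hypothetical stand-in: flows with closer origins and destinations score higher."""
    d_origin = hypot(f[0] - g[0], f[1] - g[1])
    d_dest = hypot(f[2] - g[2], f[3] - g[3])
    return 1.0 / (1.0 + d_origin + d_dest)


def find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def maximum_spanning_tree(flows):
    """Kruskal's algorithm on the complete similarity graph, strongest edges first."""
    edges = sorted(
        ((similarity(flows[i], flows[j]), i, j)
         for i, j in combinations(range(len(flows)), 2)),
        reverse=True,
    )
    parent = list(range(len(flows)))
    tree = []
    for w, i, j in edges:
        ri, rj = find(parent, i), find(parent, j)
        if ri != rj:  # edge joins two separate components, so keep it
            parent[ri] = rj
            tree.append((w, i, j))
    return tree


def split_tree(n, tree, min_similarity, min_size):
    """Drop weak tree edges; leftover components below min_size are labelled noise."""
    parent = list(range(n))
    for w, i, j in tree:
        if w >= min_similarity:
            parent[find(parent, i)] = find(parent, j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(parent, i), []).append(i)
    clusters = [g for g in groups.values() if len(g) >= min_size]
    noise = sorted(i for g in groups.values() if len(g) < min_size for i in g)
    return clusters, noise


# Each flow is (origin_x, origin_y, destination_x, destination_y); made-up toy data.
flows = [
    (0.0, 0.0, 10.0, 10.0),
    (0.5, 0.0, 10.0, 10.5),
    (0.0, 0.5, 10.5, 10.0),
    (20.0, 0.0, 30.0, 5.0),
    (20.5, 0.0, 30.0, 5.5),
    (50.0, 50.0, 0.0, 0.0),
]
tree = maximum_spanning_tree(flows)
clusters, noise = split_tree(len(flows), tree, min_similarity=0.3, min_size=2)
print(clusters, noise)
```

On this toy input the first three flows and the next two form two groups, and the isolated sixth flow is reported as noise. Building the complete similarity graph takes quadratic time and memory in the number of flows, which is in line with the efficiency and memory limitation the description acknowledges.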