The MCountP-Tree for Mining Maximal Spatial Co-Location Patterns from Spatial Data Sets

Master's === National Sun Yat-sen University === Department of Computer Science and Engineering === 101 === In recent years, geographic information system (GIS) databases have grown quickly and play a significant role in many applications. How to efficiently mine maximal co-location patterns from rapidly growing spatial data is an important issue in spatial...

Bibliographic Details
Main Authors: Cheng-Hung Wang, 王政鴻
Other Authors: Ye-In Chang
Format: Others
Language: en_US
Published: 2013
Online Access: http://ndltd.ncl.edu.tw/handle/37059201510721094312
id ndltd-TW-101NSYS5392025
record_format oai_dc
spelling ndltd-TW-101NSYS5392025 2017-03-22T04:42:32Z http://ndltd.ncl.edu.tw/handle/37059201510721094312 The MCountP-Tree for Mining Maximal Spatial Co-Location Patterns from Spatial Data Sets 一個以MCountP-Tree來探勘空間資料集合中的最大空間共同位置樣式之方法 Cheng-Hung Wang 王政鴻 Master's National Sun Yat-sen University Department of Computer Science and Engineering 101 Ye-In Chang 張玉盈 2013 thesis 92 en_US
collection NDLTD
sources NDLTD
description Master's === National Sun Yat-sen University === Department of Computer Science and Engineering === 101 === In recent years, geographic information system (GIS) databases have grown quickly and play a significant role in many applications. How to efficiently mine maximal co-location patterns from rapidly growing spatial data is an important issue in spatial data mining. Applications of spatial mining include mobile service requests, public health, and public safety. Most previous work, namely the join-based approaches (full-join, partial-join, and join-less), adopts an Apriori-like approach to mine maximal co-location patterns. However, the Apriori-like approach has a very high computation cost, because it generates size-k prevalent co-locations only after the size-(k - 1) prevalent co-locations. To reduce the computation cost of the join-based approaches, Lizhen Wang et al. proposed an order-clique approach for mining maximal co-location patterns. This approach differs from the join-based approaches in that it finds the candidates of the maximal co-locations first. It uses tree data structures to mine the maximal co-location patterns, instead of the table instances used in the join-based approaches. Therefore, the order-clique approach performs better than the join-based approaches. However, when the threshold increases, the order-clique approach performs poorly because it uses no pruning strategy. Therefore, in this thesis, we propose a new approach with a pruning strategy to mine the maximal co-location patterns. Our approach finds the candidates of maximal co-location patterns more accurately than the order-clique approach, because we apply a pruning strategy to the size-2 candidates. We propose four tree data structures: the CountP-tree, the MCountP-tree, the NeighborI-tree, and the CoLI-tree.
The advantage of the CountP-tree is that it prunes the size-2 candidates of the maximal co-location patterns, which differs from the instance pruning used in the join-based approaches. The MCountP-tree presents the candidates of the maximal co-location patterns. The number of candidates found by our approach is smaller than the number found by the order-clique approach. The NeighborI-tree records every instance relation. The CoLI-tree is built from the result of the MCountP-tree by referring to the NeighborI-tree to decide the final result. Our simulation results show that the proposed approach is more efficient than the order-clique approach regardless of whether the data set is sparse or dense.
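
The general idea of pruning size-2 candidates before forming maximal candidates can be illustrated with a small sketch. This is not the thesis's CountP-tree or MCountP-tree, whose structure the abstract does not describe. It assumes the participation index commonly used in the co-location literature as the prevalence measure, Euclidean distance as the neighbor relation, and Bron-Kerbosch maximal-clique enumeration over the surviving feature pairs. All function names, the sample data, the distance `d`, and the thresholds are hypothetical.

```python
from itertools import combinations
from math import hypot

def prevalent_size2(instances, d, min_prev):
    """Return {(f1, f2): participation index} for size-2 patterns that pass min_prev.

    instances: list of (feature, instance_id, x, y) tuples (assumed layout).
    """
    counts = {}  # feature -> total number of instances
    for f, *_ in instances:
        counts[f] = counts.get(f, 0) + 1
    participating = {}  # (f1, f2) -> {feature: ids that have a neighbor of the other feature}
    for a, b in combinations(instances, 2):
        if a[0] != b[0] and hypot(a[2] - b[2], a[3] - b[3]) <= d:
            key = tuple(sorted((a[0], b[0])))
            slot = participating.setdefault(key, {key[0]: set(), key[1]: set()})
            slot[a[0]].add(a[1])
            slot[b[0]].add(b[1])
    result = {}
    for key, slot in participating.items():
        # Participation index: the smallest fraction of any feature's instances involved.
        pi = min(len(slot[f]) / counts[f] for f in key)
        if pi >= min_prev:
            result[key] = pi
    return result

def maximal_candidates(features, edges):
    """Maximal cliques (size >= 2) of the feature graph, via basic Bron-Kerbosch."""
    adj = {f: set() for f in features}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            if len(r) >= 2:
                cliques.append(tuple(sorted(r)))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(features), set())
    return sorted(cliques)

# Hypothetical sample data: (feature, instance_id, x, y)
instances = [
    ("A", 1, 0, 0), ("A", 2, 10, 10),
    ("B", 1, 0, 1), ("B", 2, 10, 11),
    ("C", 1, 1, 0), ("C", 2, 50, 50),
    ("D", 1, 100, 100),
]
features = sorted({f for f, *_ in instances})

loose = prevalent_size2(instances, d=1.5, min_prev=0.5)
strict = prevalent_size2(instances, d=1.5, min_prev=0.6)
print(maximal_candidates(features, loose))   # candidates at the lower threshold
print(maximal_candidates(features, strict))  # fewer pairs survive at the higher threshold
```

In this sketch, raising the threshold removes feature pairs before any larger candidate is formed, so fewer and smaller candidates reach the verification stage. The candidates are only upper bounds: each one would still need to be checked against actual instance neighborhoods, which is the role the abstract assigns to the CoLI-tree and the NeighborI-tree.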