Data Mining for Induction of Adjacency Grammars and Application to Terrain Pattern Recognition

The process of syntactic pattern recognition makes the analogy between the syntax of languages and the structure of spatial patterns. The recognition process is achieved by parsing a given pattern to determine if it is syntactically correct with respect to a defined grammar. The generation of patter...


Bibliographic Details
Main Author: Leighty, Brian David
Format: Others
Published: NSUWorks 2009
Subjects: grammar; induction; mining; patterns; spatial; terrain; Computer Sciences
Online Access: http://nsuworks.nova.edu/gscis_etd/212
http://nsuworks.nova.edu/cgi/viewcontent.cgi?article=1211&context=gscis_etd
id ndltd-nova.edu-oai-nsuworks.nova.edu-gscis_etd-1211
record_format oai_dc
spelling ndltd-nova.edu-oai-nsuworks.nova.edu-gscis_etd-1211 2016-10-20T03:59:12Z Data Mining for Induction of Adjacency Grammars and Application to Terrain Pattern Recognition Leighty, Brian David 2009-01-01T08:00:00Z text application/pdf http://nsuworks.nova.edu/gscis_etd/212 http://nsuworks.nova.edu/cgi/viewcontent.cgi?article=1211&context=gscis_etd CEC Theses and Dissertations NSUWorks grammar induction mining patterns spatial terrain Computer Sciences
collection NDLTD
format Others
sources NDLTD
topic grammar
induction
mining
patterns
spatial
terrain
Computer Sciences
description Syntactic pattern recognition draws an analogy between the syntax of languages and the structure of spatial patterns. Recognition is achieved by parsing a given pattern to determine whether it is syntactically correct with respect to a defined grammar. Generating pattern grammars by hand can be cumbersome when many objects are involved, which has led to the problem of spatial grammar inference. Current approaches have used genetic algorithms and inductive techniques and have demonstrated limitations. Alternative approaches are needed that produce accurate grammars while remaining computationally efficient in light of the NP-hardness of the problem. Co-location rule mining techniques in the field of Knowledge Discovery and Data Mining address the complexity issue using neighborhood restrictions and pruning strategies based on monotonic measures of interest. The goal of this research was to develop and evaluate an inductive method for inferring an adjacency grammar that uses co-location rule mining techniques to gain efficiency while providing accurate and concise production sets. The method incrementally discovers, without supervision, adjacency patterns in spatial samples, relabels them via a production rule, and repeats the procedure with the newly labeled regions. The resulting rules are used to form an adjacency grammar. Grammars were generated and evaluated within the context of a syntactic pattern recognition system that identifies landform patterns in terrain elevation datasets. The proposed method was tested using a k-fold cross-validation methodology. Two variations were also tested, one using unsupervised and one using supervised training, both with no rule pruning. Comparison of these variations with the proposed method demonstrated the effectiveness of rule pruning and rule discovery. Results showed that the proposed method of rule inference produced rulesets having recall, precision and accuracy values of 82.6%, 97.7% and 92.8%, respectively, which are similar to those obtained using supervised training. These rulesets were also the smallest, had the lowest average number of rules fired in parsing, and had the shortest average parse time. The use of rule pruning substantially reduced rule inference time (104.4 s vs. 208.9 s). The neighborhood restriction used in adjacency calculations demonstrated linear complexity in the number of regions.
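The discover, relabel, and repeat loop described in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the function name, the use of a raw adjacent-pair count as the measure of interest, the greedy non-overlapping merge, and the terrain labels in the example are all assumptions made for illustration. Only pairs of regions that share an adjacency edge are counted, which stands in for the neighborhood restriction, and pairs below a minimum support are never turned into rules, which stands in for pruning.

```python
from collections import Counter


def induce_adjacency_rules(labels, edges, min_support=2, max_rules=10):
    """Hypothetical sketch of incremental adjacency-rule induction.

    labels: dict mapping region id -> label
    edges:  iterable of (region_a, region_b) adjacency pairs, no self-loops
    Returns (rules, final_labels); each rule is (new_label, (label_1, label_2)).
    """
    labels = dict(labels)
    edges = {frozenset(e) for e in edges}
    rules = []

    def label_pair(a, b):
        # Adjacency is symmetric, so the label pair is stored in sorted order.
        return tuple(sorted((labels[a], labels[b])))

    while len(rules) < max_rules:
        # Count label pairs over adjacent regions only (assumed neighborhood restriction).
        counts = Counter(label_pair(*tuple(e)) for e in edges)
        if not counts:
            break
        # Highest support first; ties broken alphabetically for determinism.
        pair, support = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        if support < min_support:
            break  # assumed pruning: infrequent patterns do not become rules

        new_label = f"N{len(rules) + 1}"
        rules.append((new_label, pair))

        # Greedily pick non-overlapping adjacent region pairs matching the pattern.
        used, merges = set(), []
        for a, b in sorted(tuple(sorted(e)) for e in edges):
            if a in used or b in used:
                continue
            if label_pair(a, b) == pair:
                merges.append((a, b))
                used.update((a, b))

        # Relabel: the first region of each pair absorbs the second.
        remap = {}
        for keep, drop in merges:
            labels[keep] = new_label
            del labels[drop]
            remap[drop] = keep

        # Redirect edges of absorbed regions and discard edges that became loops.
        new_edges = set()
        for e in edges:
            a, b = (remap.get(r, r) for r in e)
            if a != b:
                new_edges.add(frozenset((a, b)))
        edges = new_edges

    return rules, labels


# Toy example: a chain of seven regions with made-up landform labels.
example_labels = {1: "ridge", 2: "slope", 3: "valley", 4: "slope",
                  5: "ridge", 6: "slope", 7: "valley"}
example_edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7)]
rules, final_labels = induce_adjacency_rules(example_labels, example_edges)
for new_label, (left, right) in rules:
    print(f"{new_label} -> {left} {right}")
```

Because each newly introduced label can itself appear on the right-hand side of a later rule, the rules returned form a small hierarchy, which is the sense in which repeated relabeling yields a grammar rather than a flat list of frequent pairs.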
author Leighty, Brian David
title Data Mining for Induction of Adjacency Grammars and Application to Terrain Pattern Recognition
publisher NSUWorks
publishDate 2009
url http://nsuworks.nova.edu/gscis_etd/212
http://nsuworks.nova.edu/cgi/viewcontent.cgi?article=1211&context=gscis_etd