Data Mining for Induction of Adjacency Grammars and Application to Terrain Pattern Recognition

The process of syntactic pattern recognition makes the analogy between the syntax of languages and the structure of spatial patterns. The recognition process is achieved by parsing a given pattern to determine if it is syntactically correct with respect to a defined grammar. The generation of patter...

Bibliographic Details
Main Author:	Leighty, Brian David
Format:	Others
Published:	NSUWorks 2009
Subjects:	grammar induction mining patterns spatial terrain Computer Sciences
Online Access:	http://nsuworks.nova.edu/gscis_etd/212 http://nsuworks.nova.edu/cgi/viewcontent.cgi?article=1211&context=gscis_etd

id	ndltd-nova.edu-oai-nsuworks.nova.edu-gscis_etd-1211
record_format	oai_dc
spelling	ndltd-nova.edu-oai-nsuworks.nova.edu-gscis_etd-1211 2016-10-20T03:59:12Z 2009-01-01T08:00:00Z text application/pdf CEC Theses and Dissertations NSUWorks
collection	NDLTD
format	Others
sources	NDLTD
topic	grammar induction mining patterns spatial terrain Computer Sciences
description	The process of syntactic pattern recognition makes the analogy between the syntax of languages and the structure of spatial patterns. The recognition process is achieved by parsing a given pattern to determine if it is syntactically correct with respect to a defined grammar. The generation of pattern grammars can be a cumbersome process when many objects are involved. This has led to the problem of spatial grammar inference. Current approaches have used genetic algorithms and inductive techniques and have demonstrated limitations. Alternative approaches are needed that produce accurate grammars while remaining computationally efficient in light of the NP-hardness of the problem. Co-location rule mining techniques in the field of Knowledge Discovery and Data Mining address the complexity issue using neighborhood restrictions and pruning strategies based on monotonic Measures Of Interest. The goal of this research was to develop and evaluate an inductive method for inferring an adjacency grammar utilizing co-location rule mining techniques to gain efficiency while providing accurate and concise production sets. The method incrementally discovers, without supervision, adjacency patterns in spatial samples, relabels them via a production rule and repeats the procedure with the newly labeled regions. The resulting rules are used to form an adjacency grammar. Grammars were generated and evaluated within the context of a syntactic pattern recognition system that identifies landform patterns in terrain elevation datasets. The proposed method was tested using a k-fold cross-validation methodology. Two variations were also tested using unsupervised and supervised training, both with no rule pruning. Comparison of these variations with the proposed method demonstrated the effectiveness of rule pruning and rule discovery. Results showed that the proposed method of rule inference produced rulesets having recall, precision and accuracy values of 82.6%, 97.7% and 92.8%, respectively, which are similar to those using supervised training. These rulesets were also the smallest, had the lowest average number of rules fired in parsing, and had the shortest average parse time. The use of rule pruning substantially reduced rule inference time (104.4 s vs. 208.9 s). The neighborhood restriction used in adjacency calculations demonstrated linear complexity in the number of regions.
author	Leighty, Brian David
title	Data Mining for Induction of Adjacency Grammars and Application to Terrain Pattern Recognition
publisher	NSUWorks
publishDate	2009
url	http://nsuworks.nova.edu/gscis_etd/212 http://nsuworks.nova.edu/cgi/viewcontent.cgi?article=1211&context=gscis_etd

Similar Items