Testing effectiveness of genetic algorithms for exploratory data analysis

Approved for public release; distribution is unlimited. Heuristic methods of solving exploratory data analysis problems suffer from one major weakness: uncertainty regarding the optimality of the results. The developers of DaMI (Data Mining Initiative), a genetic algorithm designed to mine the C...

Bibliographic Details
Main Author: Carter, Jason W.
Other Authors: Bhargava, Hemant K.
Language: English
Published: Monterey, California. Naval Postgraduate School 2012
Online Access: http://hdl.handle.net/10945/9065
id ndltd-nps.edu-oai-calhoun.nps.edu-10945-9065
record_format oai_dc
spelling ndltd-nps.edu-oai-calhoun.nps.edu-10945-9065
2015-06-16T16:05:58Z
Testing effectiveness of genetic algorithms for exploratory data analysis
Carter, Jason W.
Bhargava, Hemant K.
Haga, William J.
Naval Postgraduate School
Department of Systems Management
Approved for public release; distribution is unlimited
2012-08-09T19:24:12Z
2012-08-09T19:24:12Z
1997-09
Thesis
http://hdl.handle.net/10945/9065
eng
This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. As such, it is in the public domain, and under the provisions of Title 17, United States Code, Section 105, it may not be copyrighted.
Monterey, California. Naval Postgraduate School
collection NDLTD
language English
sources NDLTD
description Approved for public release; distribution is unlimited. Heuristic methods of solving exploratory data analysis problems suffer from one major weakness: uncertainty regarding the optimality of the results. The developers of DaMI (Data Mining Initiative), a genetic algorithm designed to mine the CCEP (Comprehensive Clinical Evaluation Program) database in the search for a Persian Gulf War syndrome, proposed a method to overcome this weakness: reproducibility, the conjecture that consistent convergence on the same solutions is both necessary and sufficient to ensure a genetic algorithm has effectively searched an unknown solution space. We demonstrate the weakness of this conjecture in light of accepted genetic algorithm theory. We then test the conjecture by modifying the CCEP database with the insertion of an interesting solution of known quality and performing a discovery session using DaMI on this modified database. The necessity of reproducibility as a terminating condition is falsified by the algorithm finding the optimal solution without yielding strong reproducibility. The sufficiency of reproducibility as a terminating condition is analyzed by manual examination of the CCEP database in which strong reproducibility was observed. Ex post facto knowledge of the solution space is used to prove that DaMI had not found the optimal solutions though it gave strong reproducibility, causing us to reject the conjecture that strong reproducibility is a sufficient terminating condition.
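
The sufficiency argument in the abstract can be illustrated with a toy sketch. This is not DaMI and not the thesis's experiment: a greedy bit-flip hill climber stands in for a single genetic algorithm run, the search space is a hypothetical 6-bit "trap" function, and reproducibility is assumed here to mean the fraction of runs that end on the most common solution. All names (`fitness`, `hill_climb`, `reproducibility`, `N_BITS`) are illustrative assumptions.

```python
from collections import Counter
from itertools import product

N_BITS = 6  # toy problem size (assumption, unrelated to the CCEP data)


def fitness(bits):
    """Deceptive 'trap' function: all ones is the optimum, but every
    other point scores higher the fewer ones it contains."""
    ones = sum(bits)
    return N_BITS if ones == N_BITS else N_BITS - 1 - ones


def hill_climb(start):
    """Greedy single-bit-flip search; a stand-in for one search run."""
    current = tuple(start)
    while True:
        neighbours = [
            current[:i] + (1 - current[i],) + current[i + 1:]
            for i in range(N_BITS)
        ]
        best = max(neighbours, key=fitness)
        if fitness(best) <= fitness(current):
            return current  # no neighbour improves: local optimum
        current = best


def reproducibility(results):
    """Fraction of runs that ended on the most common solution
    (an assumed definition, not the one used in the thesis)."""
    _, count = Counter(results).most_common(1)[0]
    return count / len(results)


# Hypothetical starting points: every bit string with at most N_BITS - 2 ones.
starts = [s for s in product((0, 1), repeat=N_BITS) if sum(s) <= N_BITS - 2]
results = [hill_climb(s) for s in starts]

optimum = (1,) * N_BITS
print("reproducibility:", reproducibility(results))
print("optimum found:", optimum in results)
```

Under these assumptions every run converges on the all-zeros string, so reproducibility is perfect even though the all-ones optimum is never visited. Agreement among runs only shows that the searches share a basin of attraction, which is the kind of gap the abstract reports when it rejects reproducibility as a sufficient terminating condition.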