Research on distributed data mining system and algorithm based on multi-agent

Data mining is the extraction of hidden, previously unknown knowledge and rules of potential value for decision-making from large volumes of data in databases. Association rule mining is a major research area within data mining and is widely used in practice. With the development of network technology and the improvem...


Bibliographic Details
Main Author: Jiang, Lingxia
Format: Others
Language: en
Published: 2009
Subjects:
Online Access: http://constellation.uqac.ca/137/1/030120838.pdf
id ndltd-Quebec-oai-constellation.uqac.ca-137
record_format oai_dc
spelling ndltd-Quebec-oai-constellation.uqac.ca-137 2017-07-20T17:50:38Z http://constellation.uqac.ca/137/ Research on distributed data mining system and algorithm based on multi-agent Jiang, Lingxia Informatique 2009 Thèse ou mémoire de l'UQAC NonPeerReviewed application/pdf en http://constellation.uqac.ca/137/1/030120838.pdf Jiang Lingxia. (2009). Research on distributed data mining system and algorithm based on multi-agent. Mémoire de maîtrise, Université du Québec à Chicoutimi. doi:10.1522/030120838
collection NDLTD
language en
format Others
sources NDLTD
topic Informatique
description Data mining is the extraction of hidden, previously unknown knowledge and rules of potential value for decision-making from large volumes of data in databases. Association rule mining is a major research area within data mining and is widely used in practice. With the development of network technology and the rising level of IT adoption, distributed databases have become common. Distributed data mining extracts global knowledge useful for management and decision-making from geographically distributed databases, and it has become an important topic in data mining research. It allows a mining task to be carried out by computers at different sites on the Internet, which not only improves mining efficiency and reduces the volume of data transmitted over the network but also benefits data security and privacy. Building on the relevant theory and the current state of research in data mining and distributed data mining, this thesis focuses on the structure of distributed mining systems and on distributed association rule mining algorithms. The thesis first proposes a multi-agent-based architecture for a distributed data mining system. It adopts a star network topology and uses multiple agents to mine massive data stored across distributed sites. Based on this system, the thesis then presents a new distributed association rule mining algorithm, the RK-tree algorithm, which rests on the idea of combining knowledge twice. Each sub-site first mines the local frequent itemsets from its local database and sends them to the main site. The main site combines these local frequent itemsets into a set of global candidate frequent itemsets and sends the candidates to every sub-site. Each sub-site counts the support of the global candidates and returns the counts to the main site. Finally, the main site combines the results from the sub-sites to obtain the global frequent itemsets and the global association rules. The algorithm needs only three rounds of communication between the main site and the sub-sites, which greatly reduces both the volume and the number of communications and improves the efficiency of selection. Moreover, each sub-site can use any existing efficient centralized association rule mining algorithm for its local mining, which improves local mining efficiency and reduces the workload. The algorithm is simple and easy to implement. The thesis closes with conclusions and directions for further research.
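The three-round exchange outlined in the description can be illustrated with a minimal Python sketch. It is not the thesis's RK-tree implementation: the record does not describe the RK-tree structure itself, so the local miner here is a brute-force stand-in for whatever centralized algorithm a sub-site would use, and network messages are simulated with plain function calls. The names `local_frequent_itemsets`, `count_support`, `mine_global`, `sites`, and `min_support` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def local_frequent_itemsets(transactions, min_support):
    """Sub-site step: mine itemsets that are frequent in the local database.

    Brute-force stand-in (assumption) for any centralized miner such as
    Apriori or FP-growth; only suitable for tiny examples.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    threshold = min_support * len(transactions)
    return {itemset for itemset, n in counts.items() if n >= threshold}


def count_support(transactions, candidates):
    """Sub-site step: count local occurrences of each global candidate."""
    return {
        cand: sum(1 for t in transactions if cand <= set(t))
        for cand in candidates
    }


def mine_global(sites, min_support):
    """Main-site coordination; each list comprehension models one message round."""
    # Round 1 (sub-sites -> main site): local frequent itemsets.
    local_sets = [local_frequent_itemsets(db, min_support) for db in sites]

    # Main site: first combination, the union forms the global candidates.
    candidates = set().union(*local_sets)

    # Round 2 (main site -> sub-sites): candidates are sent out.
    # Round 3 (sub-sites -> main site): local support counts come back.
    local_counts = [count_support(db, candidates) for db in sites]

    # Main site: second combination, sum counts and apply the global threshold.
    total = sum(len(db) for db in sites)
    result = {}
    for cand in candidates:
        support = sum(counts[cand] for counts in local_counts)
        if support >= min_support * total:
            result[cand] = support
    return result


if __name__ == "__main__":
    sites = [
        [["a", "b"], ["a", "c"], ["a", "b", "c"]],
        [["b", "c"], ["a", "b"], ["b"]],
    ]
    for itemset, support in mine_global(sites, 0.5).items():
        print(sorted(itemset), support)
```

The union in the first combination step loses no global result when every site uses the same relative threshold: an itemset that is frequent over all sites taken together must be frequent at one site or more, so it always appears among the candidates. Deriving association rules from the returned supports is left out of the sketch.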
author Jiang, Lingxia
title Research on distributed data mining system and algorithm based on multi-agent
publishDate 2009
url http://constellation.uqac.ca/137/1/030120838.pdf