Mapping Algorithms for KNN Applications with Categorical Data

Master's === National Chiayi University === Graduate Institute of Computer Science and Information Engineering === 92 === In this paper, we present a novel method that transforms data to speed up the processing of high-dimensional K-nearest neighbor queries in an indexed data environment. The transform method retains the similarity property of each attribute, and searc...

Bibliographic Details
Main Authors: Yi-Sen Lin, 林奕森
Other Authors: 郭煌政
Format: Others
Language: zh-TW
Published: 2004
Online Access: http://ndltd.ncl.edu.tw/handle/26227870786117627780
id ndltd-TW-092NCYU0392016
record_format oai_dc
spelling ndltd-TW-092NCYU0392016 2016-06-17T04:16:05Z http://ndltd.ncl.edu.tw/handle/26227870786117627780 Mapping Algorithms for KNN Applications with Categorical Data KNN應用之種類型資料對映演算法 Yi-Sen Lin 林奕森 Master's National Chiayi University Graduate Institute of Computer Science and Information Engineering 92 郭煌政 2004 學位論文 ; thesis 49 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Chiayi University === Graduate Institute of Computer Science and Information Engineering === 92 === In this paper, we present a novel method that transforms data to speed up the processing of high-dimensional K-nearest neighbor (KNN) queries in an indexed data environment. The transform method retains the similarity property of each attribute and, because it reduces the number of dimensions, lets the search space be searched more efficiently. Memory-Based Reasoning (MBR) is a useful data mining technique that handles different kinds of attributes, such as categorical and numeric values. The method maps data so that multi-dimensional KNN queries can be processed faster in an indexed data environment. Because MBR must calculate the target attribute value using the entire training dataset, obtaining a result is very time-consuming, so an index framework must be built. However, the input attributes in the training dataset are both categorical and numeric, and multi-dimensional index frameworks cannot handle categorical values well, so categorical attributes must be converted into numeric ones. The mapping algorithm should preserve the distance relationships among the categories of an attribute as much as possible. We use a real-life dataset for approximate K-nearest neighbor searching. The experimental results show that our algorithm achieves good accuracy.
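The abstract does not spell out the mapping algorithm, so the following is only a minimal sketch of the general idea of converting a categorical attribute into a numeric one while keeping category distances meaningful. It assumes a binary target and maps each category to the fraction of its training rows that carry the positive label; this is a swapped-in, well-known heuristic, not the thesis's method. The names `map_categories` and `k_nearest`, the toy data, and the brute-force search (used here in place of a multi-dimensional index) are all hypothetical.

```python
from collections import Counter


def map_categories(values, labels, positive):
    """Map each category to the fraction of its rows labelled `positive`.

    Assumption: binary target. Categories with similar target behaviour
    end up close together on the numeric axis.
    """
    totals = Counter(values)
    hits = Counter(v for v, y in zip(values, labels) if y == positive)
    return {v: hits[v] / totals[v] for v in totals}


def k_nearest(points, query, k):
    """Brute-force KNN: indices of the k points closest to `query`
    by squared Euclidean distance (no index structure is used)."""
    def squared_distance(p):
        return sum((a - b) ** 2 for a, b in zip(p, query))

    order = sorted(range(len(points)), key=lambda i: squared_distance(points[i]))
    return order[:k]


# Hypothetical toy training set: one categorical and one numeric attribute.
colors = ["red", "red", "blue", "blue", "green", "green", "green", "red"]
ages = [0.2, 0.4, 0.9, 0.8, 0.5, 0.6, 0.4, 0.3]
labels = ["yes", "yes", "no", "no", "yes", "no", "no", "yes"]

mapping = map_categories(colors, labels, positive="yes")

# After mapping, every record is a purely numeric point.
points = [(mapping[c], a) for c, a in zip(colors, ages)]
query = (mapping["red"], 0.35)
neighbours = k_nearest(points, query, k=3)
```

Once every attribute is numeric, the points could be loaded into a multi-dimensional index instead of being scanned linearly; the linear scan above only keeps the sketch self-contained.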
author2 郭煌政
author_facet 郭煌政
Yi-Sen Lin
林奕森
author Yi-Sen Lin
林奕森
spellingShingle Yi-Sen Lin
林奕森
Mapping Algorithms for KNN Applications with Categorical Data
author_sort Yi-Sen Lin
title Mapping Algorithms for KNN Applications with Categorical Data
title_short Mapping Algorithms for KNN Applications with Categorical Data
title_full Mapping Algorithms for KNN Applications with Categorical Data
title_fullStr Mapping Algorithms for KNN Applications with Categorical Data
title_full_unstemmed Mapping Algorithms for KNN Applications with Categorical Data
title_sort mapping algorithms for knn applications with categorical data
publishDate 2004
url http://ndltd.ncl.edu.tw/handle/26227870786117627780
work_keys_str_mv AT yisenlin mappingalgorithmsforknnapplicationswithcategoricaldata
AT línyìsēn mappingalgorithmsforknnapplicationswithcategoricaldata
AT yisenlin knnyīngyòngzhīzhǒnglèixíngzīliàoduìyìngyǎnsuànfǎ
AT línyìsēn knnyīngyòngzhīzhǒnglèixíngzīliàoduìyìngyǎnsuànfǎ
_version_ 1718307144128790528