Personalized Document Clustering: Technique Development and Empirical Evaluation

Master's thesis === National Sun Yat-sen University === Department of Information Management === 91 === With the proliferation of electronic commerce and the knowledge economy, both organizations and individuals generate and consume a large amount of online information, typically available as textual documents. To manage the ever-increasing volume of d...

Bibliographic Details
Main Authors: Chia-Chen Wu, 吳佳真
Other Authors: Chih-Ping Wei
Format: Others
Language: en_US
Published: 2003
Online Access: http://ndltd.ncl.edu.tw/handle/64374953260629343585
id ndltd-TW-091NSYS5396073
record_format oai_dc
spelling ndltd-TW-091NSYS5396073 2016-06-22T04:20:46Z http://ndltd.ncl.edu.tw/handle/64374953260629343585 Personalized Document Clustering: Technique Development and Empirical Evaluation 個人化文件分群:技術發展與實證評估 Chia-Chen Wu 吳佳真 Master's thesis, National Sun Yat-sen University, Department of Information Management, 91 Chih-Ping Wei 魏志平 2003 thesis 51 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's thesis === National Sun Yat-sen University === Department of Information Management === 91 === With the proliferation of electronic commerce and the knowledge economy, both organizations and individuals generate and consume a large amount of online information, typically available as textual documents. To manage the ever-increasing volume of documents, organizations and individuals typically organize their documents into categories to facilitate document management and subsequent information access and browsing. However, document grouping behaviors are intentional acts, reflecting individuals’ (or organizations’) preferential perspectives on semantic coherency or relevant groupings between subjects. Thus, an effective document clustering technique needs to address this preferential perspective on document grouping and support personalized document clustering. In this thesis, we designed and implemented a personalized document clustering approach by incorporating an individual’s partial clustering into the document clustering process. Combining two document representation methods (i.e., feature refinement and feature weighting) with two clustering processes (i.e., pre-cluster-based and atomic-based), we propose four personalized document clustering techniques. Using the clustering effectiveness achieved by a traditional content-based document clustering technique as the performance benchmark, our evaluation results suggest that the use of partial clusters improves document clustering effectiveness. Moreover, the pre-cluster-based technique outperforms the atomic-based one, and the feature weighting method for document representation achieves a higher clustering effectiveness than the feature refinement method does.
author2 Chih-Ping Wei
author_facet Chih-Ping Wei
Chia-Chen Wu
吳佳真
author Chia-Chen Wu
吳佳真
title Personalized Document Clustering: Technique Development and Empirical Evaluation
publishDate 2003
url http://ndltd.ncl.edu.tw/handle/64374953260629343585