Personalized Document Clustering: Technique Development and Empirical Evaluation
Master’s === National Sun Yat-sen University === Department of Information Management === 91 === With the growth of electronic commerce and the knowledge economy, both organizations and individuals generate and consume large amounts of online information, typically available as textual documents. To manage the ever-increasing volume of d...
Main Authors: | Chia-Chen Wu 吳佳真 |
---|---|
Other Authors: | Chih-Ping Wei 魏志平 |
Format: | Others |
Language: | en_US |
Published: | 2003 |
Online Access: | http://ndltd.ncl.edu.tw/handle/64374953260629343585 |
id |
ndltd-TW-091NSYS5396073 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-091NSYS5396073 2016-06-22T04:20:46Z http://ndltd.ncl.edu.tw/handle/64374953260629343585 Personalized Document Clustering: Technique Development and Empirical Evaluation 個人化文件分群:技術發展與實證評估 Chia-Chen Wu 吳佳真 Master’s National Sun Yat-sen University Department of Information Management 91 Chih-Ping Wei 魏志平 2003 Degree thesis ; thesis 51 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master’s === National Sun Yat-sen University === Department of Information Management === 91 === With the growth of electronic commerce and the knowledge economy, both organizations and individuals generate and consume large amounts of online information, typically available as textual documents. To manage the ever-increasing volume of documents, they typically organize documents into categories to facilitate document management and subsequent information access and browsing. However, document grouping is an intentional act that reflects an individual’s (or organization’s) preferred perspective on semantic coherency or on relevant groupings between subjects. An effective document clustering technique therefore needs to accommodate this preferential perspective and support personalized document clustering. In this thesis, we designed and implemented a personalized document clustering approach that incorporates an individual’s partial clustering into the document clustering process. Combining two document representation methods (feature refinement and feature weighting) with two clustering processes (pre-cluster-based and atomic-based), we propose four personalized document clustering techniques. Using the clustering effectiveness achieved by a traditional content-based document clustering technique as the performance benchmark, our evaluation results suggest that the use of partial clusters improves document clustering effectiveness. Moreover, the pre-cluster-based technique outperforms the atomic-based one, and the feature weighting method for document representation achieves higher clustering effectiveness than the feature refinement method. |
author2 |
Chih-Ping Wei |
author_facet |
Chih-Ping Wei; Chia-Chen Wu 吳佳真 |
author |
Chia-Chen Wu 吳佳真 |
author_sort |
Chia-Chen Wu |
title |
Personalized Document Clustering: Technique Development and Empirical Evaluation |
publishDate |
2003 |
url |
http://ndltd.ncl.edu.tw/handle/64374953260629343585 |