A Study on High-Performance Documents Classification System Based on Ontology and Vector Space Model

Bibliographic Details
Main Authors: Jiang Jing Yi, 江靜宜
Other Authors: 張儀興
Format: Others
Language: zh-TW
Published: 2005
Online Access: http://ndltd.ncl.edu.tw/handle/62228557583597973074
Description
Summary: Master's thesis === 南台科技大學 === Department of Information Management === 93 === ABSTRACT With the rapid growth of computer networks, users must spend a great deal of time retrieving the information they want, or receiving messages, from the Internet. A high-performance document classification system is therefore needed to retrieve the information users need. This thesis proposes such a system, HPDCS, built on the OTVM structure, which combines an ontology with the Vector Space Model (VSM). The main idea is to exploit two properties: the vector space model supports many kinds of term weights, which speeds up information retrieval, and the concept hierarchy of the ontology supplies a complete glossary for document classification. Comparing OTVM with the plain VSM yields two results: OTVM's document classification accuracy is 12.1% higher than that of VSM, and when the vector glossary is added to OTVM, its accuracy is up to 18.3% higher than that of VSM. In summary, HPDCS improves document classification performance: it retrieves the documents users want faster and more accurately.
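The general idea of pairing a vector space model with a concept hierarchy can be illustrated with a minimal sketch. This is not the thesis's OTVM or HPDCS implementation, whose details the abstract does not give. It assumes plain term-frequency weights, cosine similarity, and a hypothetical toy hierarchy (`ONTOLOGY`) that maps a term to one parent concept; all names and sample data below are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical toy concept hierarchy (term -> parent concept); an assumption,
# not the ontology used in the thesis.
ONTOLOGY = {
    "python": "programming",
    "java": "programming",
    "compiler": "programming",
    "goal": "sports",
    "striker": "sports",
    "league": "sports",
}


def vectorize(text, use_ontology=True):
    """Build a term-frequency vector; optionally add each term's parent concept."""
    terms = text.lower().split()
    vec = Counter(terms)
    if use_ontology:
        for term in terms:
            parent = ONTOLOGY.get(term)
            if parent:
                vec[parent] += 1  # concept acts as an extra shared dimension
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, prototypes, use_ontology=True):
    """Return the best-scoring label and the similarity score of every label."""
    doc = vectorize(text, use_ontology)
    scores = {
        label: cosine(doc, vectorize(sample, use_ontology))
        for label, sample in prototypes.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical class prototypes: one short sample text per category.
PROTOTYPES = {"programming": "python compiler", "sports": "league goal"}

label, scores = classify("java tutorial", PROTOTYPES, use_ontology=True)
_, plain_scores = classify("java tutorial", PROTOTYPES, use_ontology=False)
```

In this toy setup the document "java tutorial" shares no literal term with either prototype, so the plain term vectors give it zero similarity to both classes. With the concept dimension added, "java" and the programming prototype both map to the concept "programming", which gives a nonzero similarity to that class only. Whether OTVM uses the hierarchy in this way is an assumption.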