Summary: | Master's === Tunghai University === Department of Industrial Engineering and Enterprise Information === 93 === Traditionally, document classification has been done mostly by hand. However, as enterprises grow and information technology advances, the volume and variety of electronic documents keep increasing, and manual classification can no longer keep up. Using automated techniques to classify large numbers of documents effectively has therefore become a clear trend. A conventional automatic classification model requires documents with known classification characteristics as training data, from which it learns to imitate manual classification; the accuracy and feasibility of the model are then evaluated on test data. Documents without such classification characteristics are usually handled by full-text retrieval instead.
However, if document classification is incorporated into the search process, each document can be assigned to a specific category, and retrieval can focus on the categories most closely related to the query. This addresses two drawbacks of full-text retrieval, namely the excessive number of documents returned and the lack of an objective mechanism for selecting documents for the user's reference, and it raises retrieval accuracy.
This study takes R&D documents in the IC manufacturing industry as an example and proposes a process-oriented document classification framework to improve on methods that use the project dimension or the document dimension as the basis for classification. Systems analysis is then used to analyze the requirements of a document classification and retrieval system. Finally, to enhance document retrieval capability, a prototype document classification and retrieval system based on the Vector-Space Model is built.
|
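The Vector-Space Model mentioned in the summary can be illustrated with a minimal sketch. It is not the prototype described in the thesis: the TF-IDF weighting, the smoothed IDF formula, the whitespace tokenizer, the function names, and the sample documents in `DOCS` are all assumptions made for illustration. Each document and the query become weighted term vectors, and documents are ranked by cosine similarity to the query.

```python
import math
from collections import Counter

# Hypothetical sample documents, invented for illustration only.
DOCS = {
    "etch_recipe": "plasma etch recipe tuning for wafer lot",
    "litho_report": "lithography exposure dose experiment report",
    "yield_memo": "wafer yield analysis memo",
}

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def build_idf(token_lists):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(token_lists)
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))  # count each term once per document
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector as a dict; terms unseen in the corpus are dropped."""
    tf = Counter(tokens)
    return {t: f * idf[t] for t, f in tf.items() if t in idf}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank(query, docs):
    """Return (name, score) pairs sorted by descending similarity to the query."""
    doc_tokens = {name: tokenize(text) for name, text in docs.items()}
    idf = build_idf(list(doc_tokens.values()))
    q_vec = tfidf_vector(tokenize(query), idf)
    scores = [(name, cosine(q_vec, tfidf_vector(toks, idf)))
              for name, toks in doc_tokens.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

if __name__ == "__main__":
    for name, score in rank("etch recipe", DOCS):
        print(f"{name}: {score:.3f}")
```

In a setting like the one the summary describes, the same similarity measure could in principle be computed only over documents in the categories judged relevant to the query, which is one plausible way to reduce the number of documents returned.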