Summary: | Master's === Tamkang University === Master's Program, Department of Computer Science and Information Engineering === 93 === Using feature keywords, we can derive appropriate rules from a set of labeled documents and then use those rules to classify documents that have not been labeled. In this paper, we discuss how to choose training data as a basis, how to calculate the weight of each keyword, how to judge a keyword's importance from its distribution, and how to handle correlation between keywords.
We aim to sidestep keyword correlation efficiently and to filter out noise. First, we use a decision tree, because it can ignore word-to-word relations in its first step. Second, we use a two-phase local feature to reduce the amount of noise. The results in Chapter 4 show that this approach is more efficient than before.
|
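As an illustration of judging a keyword's importance by its distribution over labeled documents, the following is a minimal sketch. The abstract does not state the weighting formula or the tree-induction criterion, so information gain is swapped in here as one common choice for selecting a decision-tree split. The toy documents, the labels, and the names `entropy` and `information_gain` are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, keyword):
    """Entropy reduction obtained by splitting documents on keyword presence."""
    with_kw = [lab for doc, lab in zip(docs, labels) if keyword in doc]
    without_kw = [lab for doc, lab in zip(docs, labels) if keyword not in doc]
    total = len(labels)
    remainder = (
        (len(with_kw) / total) * entropy(with_kw)
        + (len(without_kw) / total) * entropy(without_kw)
    )
    return entropy(labels) - remainder


# Hypothetical toy training set: each document is a set of keywords.
docs = [
    {"goal", "match", "team"},
    {"team", "coach", "score"},
    {"stock", "market", "price"},
    {"market", "bank", "team"},
]
labels = ["sports", "sports", "finance", "finance"]

# Rank every keyword by how well its presence separates the classes.
vocabulary = sorted(set().union(*docs))
ranked = sorted(
    vocabulary,
    key=lambda kw: information_gain(docs, labels, kw),
    reverse=True,
)
best = ranked[0]  # candidate keyword for the first split of a decision tree
```

In this toy set, a keyword that appears in almost every document regardless of class (here `team`) scores low, while a keyword concentrated in one class scores high. This is the sense in which a keyword's distribution can indicate its importance.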