Summary: | In recent years, the number of unknown computer vulnerabilities has increased rapidly. Analyzing and classifying large volumes of vulnerability data promptly and accurately remains an important unsolved problem. This paper therefore proposes a text classification method for computer vulnerability descriptions that combines feature extraction based on information entropy and a comprehensive function (S-C) with the averaged one-dependence estimators (AODE) classifier. First, feature words are extracted by the S-C feature extraction method. The comprehensive function C, which combines the between-class and within-class importance of a word, is used to calculate the importance of each word to each class. Then, the information entropy S of a word over the classes is used to down-weight words whose occurrences are scattered across classes, and an accurate feature set is selected. Finally, the vulnerability data set is classified with AODE, which accounts for the dependencies among feature words. Experimental comparison shows that the S-C feature extraction method extracts an accurate feature word set and that, combined with the AODE classifier, it achieves higher classification accuracy than traditional classifier models.
|