Using the Support Vector Machine to classify the Chinese text readability – A Case of Elementary Chinese Textbook

Bibliographic Details
Main Author: 胡夢珂
Other Authors: 張國恩 (advisor)
Format: Others
Language: zh-TW
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/05892956462226005886
Description
Summary: Master's thesis === National Taiwan Normal University (國立臺灣師範大學) === Department of Information Education (資訊教育學系) === 99 === Language plays an important part in every field, and reading is the most efficient way to improve language ability. Readability estimates whether an article is suitable for a given reader. Past research holds that readability is a means of matching the level of an article to readers of different educational attainment. Research on English readability is well under way, while research on Chinese readability has made little progress. However, interest in Chinese is now growing, so it is important to find a suitable way to classify text readability. In past research, many Western readability formulas used linear models because of the technological limits of their time, and linear readability formulas are too limited for the data in this research. Therefore, the purpose of this research is to use a prediction model trained with a support vector machine (SVM) to classify the readability of elementary Chinese textbook texts, to check whether each text's actual level matches the predicted level, and finally to analyze the misclassified texts in order to improve the accuracy of readability classification. The experimental materials were compiled by curriculum experts and reviewed by the national compilation organization: 386 texts in total, taken from first- to sixth-grade textbooks in three versions from private publishers (versions H, K, and N), with classical Chinese texts removed. Part of the texts served as training materials and the rest as testing materials. After Chinese word segmentation and data format conversion, the texts were classified by SVM. The prediction accuracy for elementary texts was 47.92%, while the fit rate was 80.31%. Finally, the wrong predictions were analyzed to understand their causes.
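The summary mentions a "data format conversion" step between word segmentation and SVM classification but does not say which features or which tool were used. The following is only a minimal sketch of what such a step might look like, assuming bag-of-words counts written in the sparse `label index:value` text format read by LIBSVM-style tools. The function names, the toy texts, and their grade labels are hypothetical and are not taken from the thesis; the texts are assumed to be already segmented into words.

```python
from collections import Counter


def build_vocabulary(token_lists):
    """Map each distinct word to a 1-based feature index, in first-seen order."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            if tok not in vocab:
                vocab[tok] = len(vocab) + 1
    return vocab


def to_sparse_line(grade, tokens, vocab):
    """Format one text as 'label index:count ...' with indices ascending."""
    # Words missing from the vocabulary (e.g. unseen test words) are skipped.
    counts = Counter(vocab[t] for t in tokens if t in vocab)
    features = " ".join(f"{i}:{c}" for i, c in sorted(counts.items()))
    return f"{grade} {features}".strip()


# Hypothetical, already-segmented toy texts with grade labels (assumption,
# not data from the thesis).
samples = [
    (1, ["我", "喜歡", "小狗"]),
    (6, ["閱讀", "能", "提升", "語文", "能力", "閱讀"]),
]

vocab = build_vocabulary(tokens for _, tokens in samples)
lines = [to_sparse_line(grade, tokens, vocab) for grade, tokens in samples]

for line in lines:
    print(line)
```

Each output line starts with the grade label, followed by the index and count of every word that occurs in the text. In a real experiment the vocabulary would be built from the training texts only, so that the testing texts do not influence the feature space.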