Summary: | Master's === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 90 === With the rapid growth of the Internet, Web news has become increasingly popular. Many people have changed how they read news: instead of newspapers or TV channels, they now read it on the WWW. However, most news websites lack automated processing and rely on human effort to classify or select the news shown on their pages. Our system was built to address these problems: it fetches news data and classifies and clusters news content automatically. It reduces the human effort required and increases the efficiency of the process from obtaining the news to displaying the results.
In this thesis, we exploit the HTML format to make the classification results more accurate. We define a similarity measure between documents and use the k-means algorithm to make the clustering process more efficient. Our system also provides two kinds of user interfaces so that users can read news more efficiently and conveniently.
|
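The clustering step mentioned in the summary can be illustrated with a minimal sketch. It assumes cosine similarity over plain term-frequency vectors and seeds the clusters with the first k documents; the thesis's own similarity definition and its use of HTML format are not given in this record, so the function names (`vectorize`, `cosine`, `kmeans`) and the sample headlines are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Map text to a unit-length term-frequency vector (term -> weight)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (their dot product)."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def mean_vector(vectors):
    """Sum the member vectors and rescale the result to unit length."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    norm = math.sqrt(sum(w * w for w in total.values()))
    return {t: w / norm for t, w in total.items()} if norm else {}


def kmeans(docs, k, iterations=10):
    """Cluster docs into k groups; returns one cluster label per document."""
    vectors = [vectorize(d) for d in docs]
    centroids = vectors[:k]  # assumption: seed with the first k documents
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each document joins its most similar centroid.
        labels = [
            max(range(k), key=lambda j: cosine(v, centroids[j]))
            for v in vectors
        ]
        # Update step: recompute each centroid from its current members.
        for j in range(k):
            members = [v for v, l in zip(vectors, labels) if l == j]
            if members:
                centroids[j] = mean_vector(members)
    return labels


headlines = [
    "stock market shares fall",
    "football team wins match",
    "market shares rise stock",
    "team match football final",
]
print(kmeans(headlines, k=2))
```

A real news system would likely add stop-word removal, term weighting, and a better seeding strategy; those choices are left out here to keep the sketch short.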