Summary: | Master's === Southern Taiwan University of Science and Technology (南台科技大學) === Department of Information Management === 101 === As Internet technology has developed and spread to all walks of life, people have gradually shifted from obtaining and disseminating information through books, television, and other traditional media to doing so through the Internet and search engines. As a result, the amount of information on the Internet keeps growing rapidly day by day. Extracting suitable information quickly from this huge volume of data with a search engine has become an important issue. Consequently, technologies such as search engine construction and optimization, and the exploration, extraction, classification, and summarization of network information, have become important research topics in recent years. The most critical of these is information classification. Accurate and effective classification techniques help analyze and infer users' real needs, so that appropriate classifications can be provided to meet those needs.
The most important part of classification technology is building correct and identifiable terminology automatically or semi-automatically. Human knowledge multiplies as the flow of information accelerates, and new terminology is constantly coined for new knowledge. In the past, terminology tables were built manually, which is costly to build, even slower to maintain, and falls far behind the growth of knowledge. It is therefore hard to retrieve the desired information accurately and quickly from a huge and rapidly growing body of knowledge. This study addresses that problem: it examines the main existing classification techniques proposed by scholars, evaluates their strengths and weaknesses experimentally, and derives the TF-ITF algorithm, which can be used to extract terminology from a known knowledge field. The results show that TF-ITF can effectively extract a large number of terms from both known and unknown knowledge fields, significantly reducing cost and time, and can thereby help IT operate more efficiently and improve classification accuracy.
|