Tools for Automatic Language Classification

Master's === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 98 === With the globalization of information, communication between countries and cultures has become frequent, and language/encoding classification now appears almost everywhere. From text editors to web browsers, mail systems, and information retrieval, language/encoding classifi...

Bibliographic Details
Main Authors: Nai-fan Hsiao, 蕭乃凡
Other Authors: Sun Wu
Format: Others
Language: zh-TW
Published: 2010
Online Access: http://ndltd.ncl.edu.tw/handle/92126373168389321757
id ndltd-TW-098CCU05392063
record_format oai_dc
spelling ndltd-TW-098CCU05392063 2015-10-13T18:25:32Z http://ndltd.ncl.edu.tw/handle/92126373168389321757 Tools for Automatic Language Classification 自動語系分類之工具 Nai-fan Hsiao 蕭乃凡 Master's National Chung Cheng University Graduate Institute of Computer Science and Information Engineering 98 Sun Wu 吳昇 2010 Thesis 38 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 98 === With the globalization of information, communication between countries and cultures has become frequent, and language/encoding classification now appears almost everywhere. From text editors to web browsers, mail systems, and information retrieval, language/encoding classification is a small but important tool in computer science. In this thesis we develop a language/encoding classification tool. The classification method combines an encoding scheme check, statistical analysis of high-frequency terms, and Unicode encoding table lookup. It adopts a TF-IDF data-training technique, multi-pattern matching, and a weighted scoring mechanism. In addition, the tool is implemented as a network service, providing remote access for distributed computing environments.
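The three stages named in the description (encoding scheme check, Unicode table lookup, weighted scoring of high-frequency terms) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the names `detect_encoding`, `classify`, `TERM_WEIGHTS`, `SCRIPT_RANGES`, and `SCRIPT_WEIGHT` are hypothetical, the term weights are hand-written placeholders rather than TF-IDF-trained values, and a dictionary lookup over whitespace-separated tokens stands in for multi-pattern matching.

```python
from collections import Counter

# Hypothetical high-frequency term weights; a real system would learn
# these from training data (e.g. with TF-IDF) instead of hard-coding them.
TERM_WEIGHTS = {
    "en": {"the": 1.0, "and": 0.8, "of": 0.8},
    "fr": {"le": 1.0, "et": 0.8, "les": 0.9},
    "de": {"der": 1.0, "und": 0.9, "die": 0.9},
}

# Unicode code point ranges used for script-based lookup (illustrative subset).
SCRIPT_RANGES = {
    "zh": [(0x4E00, 0x9FFF)],   # CJK Unified Ideographs
    "ko": [(0xAC00, 0xD7A3)],   # Hangul Syllables
}

SCRIPT_WEIGHT = 0.5  # assumed score contributed by each matching character


def detect_encoding(data, candidates=("utf-8", "big5")):
    """Encoding scheme check: return the first candidate that decodes cleanly."""
    for enc in candidates:
        try:
            return enc, data.decode(enc)
        except UnicodeDecodeError:
            continue
    # Fallback: latin-1 maps every byte to a character, so it never fails.
    return "latin-1", data.decode("latin-1")


def classify(data):
    """Return (encoding, language); language is None if nothing scores."""
    encoding, text = detect_encoding(data)
    scores = Counter()

    # Unicode table lookup: score each character that falls in a known range.
    for ch in text:
        cp = ord(ch)
        for lang, ranges in SCRIPT_RANGES.items():
            if any(lo <= cp <= hi for lo, hi in ranges):
                scores[lang] += SCRIPT_WEIGHT

    # High-frequency term matching with weighted scoring.
    tokens = text.lower().split()
    for lang, weights in TERM_WEIGHTS.items():
        for tok in tokens:
            scores[lang] += weights.get(tok, 0.0)

    if not scores:
        return encoding, None
    best = max(scores, key=scores.get)
    return encoding, (best if scores[best] > 0 else None)


print(classify("the cat and the dog".encode("utf-8")))
print(classify("自動語系分類之工具".encode("big5")))
```

Trying strict UTF-8 decoding first is a deliberate choice in this sketch: UTF-8 rejects most byte sequences produced by legacy encodings, whereas a permissive codec tried first would accept almost anything.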
author2 Sun Wu
author_facet Sun Wu
Nai-fan Hsiao
蕭乃凡
author Nai-fan Hsiao
蕭乃凡
title Tools for Automatic Language Classification
publishDate 2010
url http://ndltd.ncl.edu.tw/handle/92126373168389321757