Thesaurus Extraction From the World Wide Web

Master's thesis === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 91 === As the amount of data on the WWW grows, more and more research aims to extract valuable information from the web. In this thesis, we present a system that automatically extracts a thesaurus from the WWW. The system uses two thesaurus extraction methods...

Bibliographic Details
Main Authors: Yi-Min Shih, 石逸民
Other Authors: Sun Wu
Format: Others
Language: zh-TW
Online Access: http://ndltd.ncl.edu.tw/handle/06154737758604082281
id ndltd-TW-091CCU00392123
record_format oai_dc
spelling ndltd-TW-091CCU00392123 2016-06-24T04:15:54Z http://ndltd.ncl.edu.tw/handle/06154737758604082281 Thesaurus Extraction From the World Wide Web 從全球資訊網擷取同義詞 Yi-Min Shih 石逸民 Master's thesis, National Chung Cheng University, Graduate Institute of Computer Science and Information Engineering, 91 Sun Wu 吳昇 學位論文 ; thesis 42 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's thesis === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 91 === As the amount of data on the WWW grows, more and more research aims to extract valuable information from the web. In this thesis, we present a system that automatically extracts a thesaurus from the WWW. The system uses two extraction methods. The first method relies on common writing conventions to extract content from web pages; it then applies syntactic analysis to that content to extract thesaurus candidates, merges the candidates, and filters out noise to produce a thesaurus dictionary. The second method analyzes anchor text on the web to produce web site aliases and abbreviations. We also collect data from web sites encoded in BIG5 and GB, and extract the correspondence between Simplified Chinese and Traditional Chinese phrases. The resulting thesaurus dictionary can be used to broaden search results and make them more precise.
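The anchor-text method mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation, which the record does not describe in detail. It assumes that different anchor texts pointing at the same URL are alias candidates for that site, and that a text must recur at least `min_count` times to count. The names `AnchorCollector`, `alias_candidates`, and `min_count` are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (href, anchor text) pairs from one HTML page."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Normalize runs of whitespace inside the anchor text.
            text = " ".join("".join(self._text).split())
            if text:
                self.pairs.append((self._href, text))
            self._href = None


def alias_candidates(pages, min_count=2):
    """Group anchor texts by target URL (assumed heuristic).

    Returns {url: sorted anchor texts} for URLs that have at least two
    distinct anchor texts, each seen at least min_count times.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for html in pages:
        collector = AnchorCollector()
        collector.feed(html)
        collector.close()
        for href, text in collector.pairs:
            counts[href.rstrip("/")][text] += 1

    result = {}
    for url, texts in counts.items():
        frequent = sorted(t for t, n in texts.items() if n >= min_count)
        if len(frequent) >= 2:
            result[url] = frequent
    return result
```

The frequency threshold is one simple way to drop generic anchors such as "click here"; a real system would likely need stronger noise filtering, as the abstract notes for the first method.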
author2 Sun Wu
author_facet Sun Wu
Yi-Min Shih
石逸民
author Yi-Min Shih
石逸民
author_sort Yi-Min Shih
title Thesaurus Extraction From the World Wide Web
url http://ndltd.ncl.edu.tw/handle/06154737758604082281