Chinese phrase segmentation method of Information Retrieval and Search Engine

Bibliographic Details
Main Authors: Dewei Yen, 焉德葳
Other Authors: Chao-Kuei Hung
Format: Others
Language: zh-TW
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/20227612775838533643
Description
Summary: Master's === Shu-Te University === Department of Computer Science and Information Engineering === 96 === For most people, search engine technology is both familiar and strange. It is familiar because people use search engines constantly in their online activity, and this prominence has led many researchers to work on improving them. Yet few people know how to build a search engine. This thesis explains search engine technology through diagrams and examples, presenting the details of each component by actually building an open source search engine, “Ozearch”, as an example. It also presents an algorithm for segmenting Chinese phrases that combines the N-gram algorithm with the word-based algorithm to improve the precision and recall of the search engine. Finally, it identifies several defects in current Chinese phrase segmentation and presents a workable method to address them.
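
The summary does not spell out how the N-gram and word-based algorithms are combined, so the following is only a minimal sketch of one common way to do it, not the thesis's actual algorithm or Ozearch code. It assumes character bigrams for the N-gram side, forward maximum matching against a small dictionary for the word-based side, and a simple union of the two token lists for indexing. The function names (`ngram_tokens`, `max_match_tokens`, `hybrid_tokens`) and the sample lexicon are hypothetical.

```python
def ngram_tokens(text, n=2):
    """Return all overlapping character n-grams of text (bigrams by default)."""
    if len(text) < n:
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def max_match_tokens(text, lexicon, max_len=4):
    """Word-based segmentation by forward maximum matching.

    At each position, take the longest substring (up to max_len characters)
    found in the lexicon; fall back to a single character if none matches.
    """
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


def hybrid_tokens(text, lexicon, n=2, max_len=4):
    """Union of word tokens and n-gram tokens, de-duplicated, order preserved.

    Assumption: word tokens are meant to help precision, while n-grams help
    recall for words that are missing from the lexicon.
    """
    merged = []
    seen = set()
    for token in max_match_tokens(text, lexicon, max_len) + ngram_tokens(text, n):
        if token not in seen:
            seen.add(token)
            merged.append(token)
    return merged


# Hypothetical lexicon, for illustration only.
lexicon = {"搜尋", "引擎", "搜尋引擎", "技術"}
print(hybrid_tokens("搜尋引擎技術", lexicon))
# → ['搜尋引擎', '技術', '搜尋', '尋引', '引擎', '擎技']
```

In this sketch a query for 引擎 still matches through its bigram even though the word-based pass indexed only the longer word 搜尋引擎, while the dictionary words remain available as tokens for more exact matching.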