A Preliminary Study of Text Retrieval Techniques Utilizing Character/Word Positions

Master's === National Taiwan University === Graduate Institute of Information Management === 88 === Text retrieval systems can be roughly categorized into two types: (a) systems with characters/words as index terms and (b) systems with phrases as index terms. The first type of systems can be implemented automatically. However, their retrieval results usually h...


Bibliographic Details
Main Authors: Lung-Chi Lin, 林隆祺
Other Authors: Yih-Kuen Tsay
Format: Others
Language: zh-TW
Published: 2000
Online Access: http://ndltd.ncl.edu.tw/handle/45115638176919765401
id ndltd-TW-088NTU00396018
record_format oai_dc
spelling ndltd-TW-088NTU00396018 2016-01-29T04:18:37Z http://ndltd.ncl.edu.tw/handle/45115638176919765401 A Preliminary Study of Text Retrieval Techniques Utilizing Character/Word Positions 運用字詞位置的文件檢索技術初探 Lung-Chi Lin 林隆祺 Master's, National Taiwan University, Graduate Institute of Information Management, 88 Yih-Kuen Tsay 蔡益坤 2000 thesis 74 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Taiwan University === Graduate Institute of Information Management === 88 === Text retrieval systems can be roughly categorized into two types: (a) systems that use characters/words as index terms and (b) systems that use phrases as index terms. Systems of the first type can be built automatically, but their retrieval results usually have high recall and low precision. Systems of the second type can achieve high precision, but they require human intervention to select the key phrases of a document, and they must deal with the phrase segmentation problem when handling a query. We seek a retrieval method that achieves both high recall and high precision and can also be implemented automatically. This thesis proposes such a method, which utilizes character/word positions. Although the method is suitable for both Chinese and English text retrieval, we focus on its use in Chinese text retrieval. The main idea is to record the position of each character/word in the index; this extra information is then used to compute the similarity between the query and a stored document. We conduct a preliminary but systematic study of similarity algorithms that use a character/word index, and we show by experiments that such algorithms produce good retrieval results.
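The abstract's main idea, recording character positions in the index and using them when scoring, can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: the abstract does not specify its similarity measures, so the scoring rule here (the fraction of adjacent query character pairs that also appear adjacently in the document) is an assumption chosen for illustration, and the names `build_positional_index`, `adjacency_score`, and `rank` are hypothetical.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Build a character-level index: character -> {doc_id: [positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, ch in enumerate(text):
            if not ch.isspace():
                index[ch][doc_id].append(pos)
    return index


def adjacency_score(index, query, doc_id):
    """Illustrative similarity (an assumption, not the thesis's measure):
    the fraction of adjacent query character pairs that also occur
    adjacently, in the same order, in the document."""
    chars = [ch for ch in query if not ch.isspace()]
    if len(chars) < 2:
        # Zero or one character: score 1.0 only if that character occurs.
        return float(bool(chars) and doc_id in index.get(chars[0], {}))
    hits = 0
    for a, b in zip(chars, chars[1:]):
        positions_a = index.get(a, {}).get(doc_id, [])
        positions_b = set(index.get(b, {}).get(doc_id, []))
        # The pair matches if some occurrence of a is directly followed by b.
        if any(p + 1 in positions_b for p in positions_a):
            hits += 1
    return hits / (len(chars) - 1)


def rank(index, query, doc_ids):
    """Return the doc ids with a positive score, best first."""
    scored = [(adjacency_score(index, query, d), d) for d in doc_ids]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [d for score, d in scored if score > 0]


if __name__ == "__main__":
    docs = {
        "d1": "文件檢索技術",
        "d2": "檢索文件的技術",
        "d3": "資訊管理",
    }
    index = build_positional_index(docs)
    for doc_id in rank(index, "文件檢索", docs):
        print(doc_id, round(adjacency_score(index, "文件檢索", doc_id), 3))
```

In this toy collection, d1 and d2 both contain every character of the query 文件檢索, so an index without positions could not tell them apart. With positions, d1 matches all three adjacent pairs while d2 matches only two of them, which is the kind of precision gain the abstract attributes to positional information.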