A Preliminary Study of Text Retrieval Techniques Utilizing Character/Word Positions
Master's === National Taiwan University === Graduate Institute of Information Management === 88 === Text retrieval systems can be roughly categorized into two types: (a) systems that use characters/words as index terms and (b) systems that use phrases as index terms. Systems of the first type can be implemented automatically. However, their retrieval results usually h...
Main Authors: | Lung-Chi Lin, 林隆祺 |
---|---|
Other Authors: | Yih-Kuen Tsay |
Format: | Others |
Language: | zh-TW |
Published: | 2000 |
Online Access: | http://ndltd.ncl.edu.tw/handle/45115638176919765401 |
id |
ndltd-TW-088NTU00396018 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-088NTU003960182016-01-29T04:18:37Z http://ndltd.ncl.edu.tw/handle/45115638176919765401 A Preliminary Study of Text Retrieval Techniques Utilizing Character/Word Positions 運用字詞位置的文件檢索技術初探 Lung-Chi Lin 林隆祺 碩士 國立臺灣大學 資訊管理研究所 88 Text retrieval systems can be roughly categorized into two types : (a) systems with characters/words as index terms and (b) systems with phrases as index terms. The first type of systems can be implemented automatically. However, their retrieval results usually have high recall but low precision. The second type of systems can achieve high precision, but require human intervention in selecting key phrases of a document. Moreover, they have to deal with the phrase segmentation problem when handling a query. We seek a retrieval method that can achieve both high recall and high precision and also can be implemented automatically. This thesis proposes such a method that utilizes character/word positions. Though our method is suitable for Chinese/English text retrieval, we focus on its use in Chinese text retrieval. The main idea is to record the position of a character/word in the index. This extra information is then used to compute the similarity between the query and a stored document. We conduct a preliminary but systematic study of the algorithms for determining similarity that utilize a character/word index and we show by experiments that such algorithms do produce good retrieval results. Yih-Kuen Tsay 蔡益坤 2000 學位論文 ; thesis 74 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan University === Graduate Institute of Information Management === 88 === Text retrieval systems can be roughly categorized into two types: (a) systems that use characters/words as index terms and (b) systems that use phrases as index terms. Systems of the first type can be implemented automatically, but their retrieval results usually have high recall and low precision. Systems of the second type can achieve high precision, but they require human intervention to select the key phrases of a document, and they must also deal with the phrase segmentation problem when handling a query. We seek a retrieval method that achieves both high recall and high precision and can also be implemented automatically.
This thesis proposes such a method, which utilizes character/word positions. Although the method is suitable for both Chinese and English text retrieval, we focus on its use in Chinese text retrieval. The main idea is to record the position of each character/word in the index. This extra information is then used to compute the similarity between the query and a stored document. We conduct a preliminary but systematic study of similarity algorithms that utilize such a character/word index, and we show by experiments that these algorithms produce good retrieval results.
|
author2 |
Yih-Kuen Tsay |
author_facet |
Yih-Kuen Tsay Lung-Chi Lin 林隆祺 |
author |
Lung-Chi Lin 林隆祺 |
spellingShingle |
Lung-Chi Lin 林隆祺 A Preliminary Study of Text Retrieval Techniques Utilizing Character/Word Positions |
author_sort |
Lung-Chi Lin |
title |
A Preliminary Study of Text Retrieval Techniques Utilizing Character/Word Positions |
title_short |
A Preliminary Study of Text Retrieval Techniques Utilizing Character/Word Positions |
title_full |
A Preliminary Study of Text Retrieval Techniques Utilizing Character/Word Positions |
title_fullStr |
A Preliminary Study of Text Retrieval Techniques Utilizing Character/Word Positions |
title_full_unstemmed |
A Preliminary Study of Text Retrieval Techniques Utilizing Character/Word Positions |
title_sort |
preliminary study of text retrieval techniques utilizing character/word positions |
publishDate |
2000 |
url |
http://ndltd.ncl.edu.tw/handle/45115638176919765401 |
work_keys_str_mv |
AT lungchilin apreliminarystudyoftextretrievaltechniquesutilizingcharacterwordpositions AT línlóngqí apreliminarystudyoftextretrievaltechniquesutilizingcharacterwordpositions AT lungchilin yùnyòngzìcíwèizhìdewénjiànjiǎnsuǒjìshùchūtàn AT línlóngqí yùnyòngzìcíwèizhìdewénjiànjiǎnsuǒjìshùchūtàn AT lungchilin preliminarystudyoftextretrievaltechniquesutilizingcharacterwordpositions AT línlóngqí preliminarystudyoftextretrievaltechniquesutilizingcharacterwordpositions |
_version_ |
1718167383122640896 |
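
The description above says that the index records the position of each character/word and that this information is used to compute query-document similarity. The record does not give the thesis's actual similarity algorithms, so the following is only a minimal sketch under stated assumptions: the function names `build_positional_index` and `adjacency_score`, the sample documents, and the scoring rule (the fraction of adjacent query character pairs that are also adjacent, in the same order, in the document) are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each character to {doc_id: [positions where it occurs]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, ch in enumerate(text):
            index[ch][doc_id].append(pos)
    return index


def adjacency_score(index, query, doc_id):
    """Hypothetical similarity: share of adjacent query character pairs
    that also occur adjacent, in the same order, in the document."""
    if len(query) < 2:
        # A one-character query can only be present or absent.
        return 1.0 if query and doc_id in index.get(query, {}) else 0.0
    hits = 0
    for a, b in zip(query, query[1:]):
        pos_a = index.get(a, {}).get(doc_id, [])
        pos_b = set(index.get(b, {}).get(doc_id, []))
        # The pair matches if some occurrence of a is directly followed by b.
        if any(p + 1 in pos_b for p in pos_a):
            hits += 1
    return hits / (len(query) - 1)


# Illustrative documents (not from the thesis). Both contain every
# character of the query, so a character-only index cannot separate them.
docs = {"d1": "文件檢索技術", "d2": "技術文件的檢索"}
index = build_positional_index(docs)
query = "文件檢索"
ranking = sorted(docs, key=lambda d: adjacency_score(index, query, d), reverse=True)
print(ranking)
```

In this sketch both documents would be returned by a plain character index, which corresponds to the high-recall, low-precision behavior described in the abstract. Position information ranks `d1` first because the query characters appear there as one contiguous run, while in `d2` one adjacent pair is broken.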