Passage Retrieval Using Latent Semantics Indexing
Master's === National Taiwan University === Department of Information Management === 85 === With the development of information technology and the growing flow of information, everyone has to face more and more information. Without technology for information filtering, it would therefore be very difficult t...
Main Authors: | Huang, Jwo-Luen, 黃卓倫 |
---|---|
Other Authors: | Timothy Chou 曹承礎 |
Format: | Others |
Language: | zh-TW |
Published: | 1997 |
Online Access: | http://ndltd.ncl.edu.tw/handle/95765820324600203140 |
id |
ndltd-TW-085NTU00396014 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-085NTU003960142016-07-01T04:15:37Z http://ndltd.ncl.edu.tw/handle/95765820324600203140 Passage Retrieval Using Latent Semantics Indexing 利用隱藏語意索引進行文件分段檢索之研究 Huang, Jwo-Luen 黃卓倫 碩士 國立臺灣大學 資訊管理學系 85 Timothy Chou 曹承礎 1997 學位論文 ; thesis 67 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan University === Department of Information Management === 85 === With the development of information technology and the
growing flow of information, everyone has to face more and more
information. Without technology for information filtering, it
would be very difficult to find the information one needs.
Information retrieval was developed to address this difficulty.
Users expect the technology to help them find what they really
need: the query result must not only match the user's
requirements but also be meaningful to the user. If the query
result contains only a small portion of meaningful information,
it is of no value to users.
Current information retrieval systems aim to return the most
relevant documents to users. Sometimes, however, users expect
more precise results, such as paragraphs or lists. These
"passages" are what is really meaningful to users. Current
information retrieval algorithms are not suited to this kind of
application, so the original algorithms must be modified to meet
these requirements.
To address the problem of passage retrieval, a passage retrieval
system is implemented using LSI (Latent Semantics Indexing). The
properties of LSI under passage retrieval are also investigated:
optimal query length, optimal word segmentation, optimal
document segmentation, the impact of appending new documents,
and the benefit of relevance feedback.
In this research, the passage retrieval system works best when
document paragraphs, longer Chinese words, and an adequate query
length are used. The research on appending documents shows that,
with the folding-in technique, documents can be appended without
recomputing the SVD of the document index. A ratio of new
documents is found that avoids recomputing the matrix. In
addition, the document vector matrix can be used in passage
retrieval. Finally, the research on relevance feedback shows
that this technique is useful.
The conclusion is that LSI indeed fits passage and concept
retrieval, especially when searching for relevant documents from
some passages. LSI is therefore feasible for passage retrieval.
|
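The folding-in step mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the toy term-by-passage matrix, the choice of k = 2, and the helper names `build_lsi`, `fold_in`, and `rank` are all assumptions made for illustration. The sketch builds a truncated SVD with NumPy, projects a new passage into the existing space without recomputing the SVD, and ranks passages against a query by cosine similarity.

```python
import numpy as np

def build_lsi(A, k):
    """Truncated SVD of a term-by-passage matrix A (terms x passages)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep the k largest singular triplets; rows of Vt[:k].T are passage vectors.
    return U[:, :k], s[:k], Vt[:k, :].T

def fold_in(term_vec, U_k, s_k):
    """Project a raw term vector into the k-dim space: term_vec^T U_k S_k^-1."""
    return (term_vec @ U_k) / s_k

def rank(query_vec, passage_vecs):
    """Return passage indices sorted by descending cosine similarity."""
    q = query_vec / (np.linalg.norm(query_vec) + 1e-12)
    norms = np.linalg.norm(passage_vecs, axis=1, keepdims=True) + 1e-12
    return np.argsort(-((passage_vecs / norms) @ q))

# Hypothetical toy data: 5 terms (rows) x 4 passages (columns).
A = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 2, 1],
    [1, 0, 0, 1],
], dtype=float)

U_k, s_k, passages = build_lsi(A, k=2)

# Append a new passage by folding in; the SVD is not recomputed.
new_passage = np.array([1, 1, 0, 0, 0], dtype=float)
passages = np.vstack([passages, fold_in(new_passage, U_k, s_k)])

# A query is folded in the same way and compared with every passage vector.
query = np.array([1, 0, 0, 0, 0], dtype=float)
order = rank(fold_in(query, U_k, s_k), passages)
print(order)
```

Folding in is cheap because it reuses the existing `U_k` and `s_k`, but the folded-in vectors do not influence the latent space itself, which is presumably why the abstract is concerned with how many new documents can be appended before the matrix has to be recomputed.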
author2 |
Timothy Chou |
author_facet |
Timothy Chou Huang, Jwo-Luen 黃卓倫 |
author |
Huang, Jwo-Luen 黃卓倫 |
spellingShingle |
Huang, Jwo-Luen 黃卓倫 Passage Retrieval Using Latent Semantics Indexing |
author_sort |
Huang, Jwo-Luen |
title |
Passage Retrieval Using Latent Semantics Indexing |
title_short |
Passage Retrieval Using Latent Semantics Indexing |
title_full |
Passage Retrieval Using Latent Semantics Indexing |
title_fullStr |
Passage Retrieval Using Latent Semantics Indexing |
title_full_unstemmed |
Passage Retrieval Using Latent Semantics Indexing |
title_sort |
passage retrieval using latent semantics indexing |
publishDate |
1997 |
url |
http://ndltd.ncl.edu.tw/handle/95765820324600203140 |
work_keys_str_mv |
AT huangjwoluen passageretrievalusinglatentsemanticsindexing AT huángzhuōlún passageretrievalusinglatentsemanticsindexing AT huangjwoluen lìyòngyǐncángyǔyìsuǒyǐnjìnxíngwénjiànfēnduànjiǎnsuǒzhīyánjiū AT huángzhuōlún lìyòngyǐncángyǔyìsuǒyǐnjìnxíngwénjiànfēnduànjiǎnsuǒzhīyánjiū |
_version_ |
1718328816981508096 |