Passage Retrieval Using Latent Semantics Indexing

Master's === National Taiwan University === Department of Information Management === 85 === With the development of information technology and the growing flow of information, everyone has to deal with more and more information. Therefore, without technology for information filtering, it would be very difficult t...

Bibliographic Details
Main Authors: Huang, Jwo-Luen, 黃卓倫
Other Authors: Timothy Chou
Format: Others
Language: zh-TW
Published: 1997
Online Access: http://ndltd.ncl.edu.tw/handle/95765820324600203140
id ndltd-TW-085NTU00396014
record_format oai_dc
spelling ndltd-TW-085NTU00396014 2016-07-01T04:15:37Z http://ndltd.ncl.edu.tw/handle/95765820324600203140 Passage Retrieval Using Latent Semantics Indexing 利用隱藏語意索引進行文件分段檢索之研究 Huang, Jwo-Luen 黃卓倫 Master's National Taiwan University Department of Information Management 85 Timothy Chou 曹承礎 1997 thesis 67 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Taiwan University === Department of Information Management === 85 === With the development of information technology and the growing flow of information, everyone has to deal with more and more information. Without technology for information filtering, it would be very difficult to find the information one needs. Information retrieval was developed to address this difficulty. Users expect the technology to help them find what they really need: a query result must not only match the user's requirements but also be meaningful to the user. If a query result contains only a small portion of meaningful information, it is of no value to users.
Current information retrieval systems aim to return the most relevant documents to users. Sometimes, however, users expect more precise results, such as paragraphs or lists. These "passages" are what is really meaningful to users. Current information retrieval algorithms do not support this kind of application, so the original algorithms must be modified to meet these requirements.
To address the problem of passage retrieval, a passage retrieval system is implemented using LSI (Latent Semantics Indexing). The properties of LSI under passage retrieval are also investigated: optimal query length, optimal word segmentation, optimal document segmentation, the impact of appending new documents, and the benefit of relevance feedback.
In this research, the passage retrieval system works best when documents are segmented into paragraphs, longer Chinese words are used, and queries are of adequate length. The research on appending documents shows that, with the folding-in technique, documents can be appended without recomputing the SVD of the document index. A ratio of new documents is found that avoids re-computing the matrix. Second, the document vector matrix can be used in passage retrieval. Finally, the research on relevance feedback shows that this technique is useful.
The conclusion is that LSI indeed fits passage and concept retrieval, especially when searching for relevant documents from some passages. Thus, LSI is feasible for passage retrieval.
author2 Timothy Chou
author_facet Timothy Chou
Huang, Jwo-Luen
黃卓倫
author Huang, Jwo-Luen
黃卓倫
spellingShingle Huang, Jwo-Luen
黃卓倫
Passage Retrieval Using Latent Semantics Indexing
author_sort Huang, Jwo-Luen
title Passage Retrieval Using Latent Semantics Indexing
title_short Passage Retrieval Using Latent Semantics Indexing
title_full Passage Retrieval Using Latent Semantics Indexing
title_fullStr Passage Retrieval Using Latent Semantics Indexing
title_full_unstemmed Passage Retrieval Using Latent Semantics Indexing
title_sort passage retrieval using latent semantics indexing
publishDate 1997
url http://ndltd.ncl.edu.tw/handle/95765820324600203140
work_keys_str_mv AT huangjwoluen passageretrievalusinglatentsemanticsindexing
AT huángzhuōlún passageretrievalusinglatentsemanticsindexing
AT huangjwoluen lìyòngyǐncángyǔyìsuǒyǐnjìnxíngwénjiànfēnduànjiǎnsuǒzhīyánjiū
AT huángzhuōlún lìyòngyǐncángyǔyìsuǒyǐnjìnxíngwénjiànfēnduànjiǎnsuǒzhīyánjiū
_version_ 1718328816981508096