Development of an information retrieval and distillation agent
Although many search engines are commercially available today, using most of them often involves tedious human effort. Also, much of the information returned by existing search engines may not be relevant to the intended query. Furthermore, there is a lack of sys...
Main Author: | Liu, Yongsheng |
---|---|
Other Authors: | Liang, Ming |
Format: | Others |
Language: | en |
Published: | University of Ottawa (Canada) 2013 |
Subjects: | Computer Science. |
Online Access: | http://hdl.handle.net/10393/26514 http://dx.doi.org/10.20381/ruor-18223 |
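The abstract in this record names two standard building blocks, TFIDF term weighting and HITS link analysis. As a rough illustration only, the sketch below implements the textbook forms of both. It is not the modified TFIDF scheme or the combined score developed in the thesis, whose details this record does not give; the function names `tf_idf` and `hits`, the tokenized-document input, and the link dictionary are all assumptions made for the example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Textbook TF-IDF (assumed form): tf = count / doc length, idf = log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def hits(links, iterations=50):
    """Textbook HITS on a dict mapping each page to the pages it links to."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking in.
        auth = {n: 0.0 for n in nodes}
        for src, targets in links.items():
            for dst in targets:
                auth[dst] += hub[src]
        # Hub score: sum of the authority scores of pages linked to.
        hub = {n: sum(auth[d] for d in links.get(n, ())) for n in nodes}
        # Normalize both vectors so the scores stay bounded.
        a_norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        auth = {n: v / a_norm for n, v in auth.items()}
        hub = {n: v / h_norm for n, v in hub.items()}
    return hub, auth
```

In this sketch a term that occurs in every document gets weight zero, and a page that many hubs point to gets a high authority score. How the thesis merges the two signals into one ranking is not stated in the record, so no combined score is shown.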
id |
ndltd-uottawa.ca-oai-ruor.uottawa.ca-10393-26514 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-uottawa.ca-oai-ruor.uottawa.ca-10393-26514 2018-01-05T19:07:09Z Development of an information retrieval and distillation agent Liu, Yongsheng Liang, Ming, Computer Science.
2013-11-07T17:24:47Z 2013-11-07T17:24:47Z 2003 2003 Thesis Source: Masters Abstracts International, Volume: 42-06, page: 2237. http://hdl.handle.net/10393/26514 http://dx.doi.org/10.20381/ruor-18223 en 195 p. University of Ottawa (Canada) |
collection |
NDLTD |
language |
en |
format |
Others |
sources |
NDLTD |
topic |
Computer Science. |
description |
Although many search engines are commercially available today, using most of them often involves tedious human effort. Also, much of the information returned by existing search engines may not be relevant to the intended query. Furthermore, there is a lack of a systematic approach to quantifying the value of the information for the user's needs. In this thesis, to free the user from the drudgery of searching and to provide a basis for building a personalized database for a particular topic, we develop a web search and distillation agent. To retrieve higher-quality information, we modify the existing Term Frequency-Inverse Document Frequency (TFIDF) term weighting scheme and combine it with the Hyperlink-Induced Topic Search (HITS) method to measure both the importance and the relevance of a document. To construct a dynamic graph and keep continuous search affordable, we propose a Sliding Window Model (SWM) that controls the size of the graph's node set. To improve the intelligence of the search agent, we employ the Exponential Smoothing (ES) approach to guide the search.
Our experimental results show that the proposed web search and distillation approach is effective compared to other algorithms and models: the improved TFIDF algorithm improves the rationality of the search results; the proposed SWM controls the size of the node set as expected; and the ES algorithm employed in SWM further saves computing time, helps the search agent harvest higher-quality information, and offers greater advantages than the other methods implemented in the search agent. |
author2 |
Liang, Ming |
author_facet |
Liang, Ming; Liu, Yongsheng |
author |
Liu, Yongsheng |
title |
Development of an information retrieval and distillation agent |
publisher |
University of Ottawa (Canada) |
publishDate |
2013 |
url |
http://hdl.handle.net/10393/26514 http://dx.doi.org/10.20381/ruor-18223 |
_version_ |
1718601969493344256 |
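The description field in this record mentions a Sliding Window Model that bounds the node set of the graph and an Exponential Smoothing step that guides the search. The record does not say how either is defined in the thesis, so the following is only a minimal sketch of the generic ideas: simple exponential smoothing of a score series, and a bounded node set. The names `exponential_smoothing` and `SlidingWindow`, the `alpha` default, and the oldest-first eviction rule are assumptions made for illustration.

```python
from collections import OrderedDict


def exponential_smoothing(values, alpha=0.5):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for x in values:
        if not smoothed:
            # Seed the series with the first observation.
            smoothed.append(float(x))
        else:
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


class SlidingWindow:
    """Bounded node set. Oldest-first eviction is an assumption of this sketch."""

    def __init__(self, max_nodes):
        if max_nodes < 1:
            raise ValueError("max_nodes must be at least 1")
        self.max_nodes = max_nodes
        self._nodes = OrderedDict()

    def add(self, node, score):
        """Insert or refresh a node; return the nodes evicted to make room."""
        self._nodes.pop(node, None)
        self._nodes[node] = score
        evicted = []
        while len(self._nodes) > self.max_nodes:
            oldest, _ = self._nodes.popitem(last=False)
            evicted.append(oldest)
        return evicted

    def nodes(self):
        """Return the current nodes, oldest first."""
        return list(self._nodes)
```

One plausible use, again an assumption, is to smooth the relevance scores of recently fetched pages and let the smoothed value decide which links to follow next, while the window keeps the graph small enough to re-score on every step.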