Efficient data structures for information retrieval

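This dissertation deals with the application of efficient data structures and hashing algorithms to the problems of textual information storage and retrieval. We have developed static and dynamic techniques for handling large dictionaries, inverted lists, and optimizations applied to ranking algorithms. We have carried out an experiment called REVTOLC that demonstrated the efficiency and applicability of our algorithms and data structures. Also, the REVTOLC experiment revealed the effectiveness and ease of use of advanced information retrieval methods, namely extended Boolean (p-norm), vector, and vector with probabilistic feedback methods. We have developed efficient static and dynamic data structures and linear algorithms to find a class of minimal perfect hash functions for the efficient implementation of dictionaries, inverted lists, and stop lists. Further, we have developed a linear algorithm that produces order preserving minimal perfect hash functions. These data structures and algorithms enable much faster indexing of textual data and faster retrieval of best match documents using advanced information retrieval methods. Finally, we summarize our research findings and some open problems that are worth further investigation.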
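The record's description mentions minimal perfect hash functions for static dictionaries and stop lists. As a rough illustration of what such a function does, here is a minimal sketch using a generic "hash and displace" construction. This is an assumption chosen for brevity, not the algorithm developed in the dissertation, and the names `build_mphf` and `lookup` are hypothetical.

```python
import hashlib


def _h(key: str, seed: int, m: int) -> int:
    """Seeded hash of `key` into the range [0, m)."""
    digest = hashlib.blake2b(
        key.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little") % m


def build_mphf(keys):
    """Return one seed per bucket so that every key gets a unique slot.

    Assumes `keys` is a non-empty list of distinct strings.
    Illustrative hash-and-displace sketch, not the dissertation's method.
    """
    n = len(keys)
    buckets = [[] for _ in range(n)]
    for k in keys:
        buckets[_h(k, 0, n)].append(k)

    seeds = [0] * n
    taken = [False] * n
    # Place the largest buckets first; they are the hardest to fit.
    for b in sorted(range(n), key=lambda i: -len(buckets[i])):
        bucket = buckets[b]
        if not bucket:
            break  # all remaining buckets are empty
        seed = 1
        while True:
            slots = [_h(k, seed, n) for k in bucket]
            no_self_clash = len(set(slots)) == len(slots)
            if no_self_clash and not any(taken[s] for s in slots):
                break
            seed += 1
        seeds[b] = seed
        for s in slots:
            taken[s] = True
    return seeds


def lookup(seeds, key):
    """Map `key` to its slot in [0, n). Only meaningful for keys in the build set."""
    n = len(seeds)
    return _h(key, seeds[_h(key, 0, n)], n)


stop_words = ["a", "an", "and", "of", "or", "the", "to", "with"]
seeds = build_mphf(stop_words)
slots = sorted(lookup(seeds, w) for w in stop_words)
```

Here `slots` is a permutation of `0 .. len(stop_words) - 1`: the function is perfect (no collisions) and minimal (no unused slots). A key outside the build set still maps to some slot, so a stop-list membership test would also store each word at its slot and compare on lookup.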

Bibliographic Details
Main Author: Daoud, Amjad M.
Other Authors: Computer Science and Applications
Format: Others
Language: en
Published: Virginia Tech 2014
Subjects: Data structures (Computer science); Information storage and retrieval systems
Online Access: http://hdl.handle.net/10919/40031
http://scholar.lib.vt.edu/theses/available/etd-10202005-102821/
id ndltd-VTETD-oai-vtechworks.lib.vt.edu-10919-40031
record_format oai_dc
spelling ndltd-VTETD-oai-vtechworks.lib.vt.edu-10919-40031 2021-11-11T05:32:54Z Efficient data structures for information retrieval Daoud, Amjad M. Computer Science and Applications Fox, Edward A. Heath, Lenwood S. Kafura, Dennis G. Shaffer, Clifford A. Brown, Ezra A. LD5655.V856 1993.D368 Data structures (Computer science) Information storage and retrieval systems This dissertation deals with the application of efficient data structures and hashing algorithms to the problems of textual information storage and retrieval. We have developed static and dynamic techniques for handling large dictionaries, inverted lists, and optimizations applied to ranking algorithms. We have carried out an experiment called REVTOLC that demonstrated the efficiency and applicability of our algorithms and data structures. Also, the REVTOLC experiment revealed the effectiveness and ease of use of advanced information retrieval methods, namely extended Boolean (p-norm), vector, and vector with probabilistic feedback methods. We have developed efficient static and dynamic data structures and linear algorithms to find a class of minimal perfect hash functions for the efficient implementation of dictionaries, inverted lists, and stop lists. Further, we have developed a linear algorithm that produces order preserving minimal perfect hash functions. These data structures and algorithms enable much faster indexing of textual data and faster retrieval of best match documents using advanced information retrieval methods. Finally, we summarize our research findings and some open problems that are worth further investigation. Ph. D.
2014-03-14T21:21:48Z 2014-03-14T21:21:48Z 1993-08-05 2005-10-20 2005-10-20 2005-10-20 Dissertation Text etd-10202005-102821 http://hdl.handle.net/10919/40031 http://scholar.lib.vt.edu/theses/available/etd-10202005-102821/ en OCLC# 29179633 LD5655.V856_1993.D368.pdf In Copyright http://rightsstatements.org/vocab/InC/1.0/ xiv, 183 leaves BTD application/pdf application/pdf Virginia Tech
collection NDLTD
language en
format Others
sources NDLTD
topic LD5655.V856 1993.D368
Data structures (Computer science)
Information storage and retrieval systems
spellingShingle LD5655.V856 1993.D368
Data structures (Computer science)
Information storage and retrieval systems
Daoud, Amjad M.
Efficient data structures for information retrieval
description This dissertation deals with the application of efficient data structures and hashing algorithms to the problems of textual information storage and retrieval. We have developed static and dynamic techniques for handling large dictionaries, inverted lists, and optimizations applied to ranking algorithms. We have carried out an experiment called REVTOLC that demonstrated the efficiency and applicability of our algorithms and data structures. Also, the REVTOLC experiment revealed the effectiveness and ease of use of advanced information retrieval methods, namely extended Boolean (p-norm), vector, and vector with probabilistic feedback methods. We have developed efficient static and dynamic data structures and linear algorithms to find a class of minimal perfect hash functions for the efficient implementation of dictionaries, inverted lists, and stop lists. Further, we have developed a linear algorithm that produces order preserving minimal perfect hash functions. These data structures and algorithms enable much faster indexing of textual data and faster retrieval of best match documents using advanced information retrieval methods. Finally, we summarize our research findings and some open problems that are worth further investigation. === Ph. D.
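The description names the extended Boolean (p-norm) retrieval method. A minimal sketch of the standard p-norm similarity for OR and AND queries follows, assuming document term weights in [0, 1], query term weights defaulting to 1, and 1 <= p < infinity. The function names are hypothetical, and this is the textbook formulation, not code taken from the dissertation.

```python
import math


def pnorm_or(doc_weights, p, query_weights=None):
    """p-norm similarity of a document to an OR query over its terms."""
    a = query_weights or [1.0] * len(doc_weights)
    num = sum((ai ** p) * (wi ** p) for ai, wi in zip(a, doc_weights))
    den = sum(ai ** p for ai in a)
    return (num / den) ** (1.0 / p)


def pnorm_and(doc_weights, p, query_weights=None):
    """p-norm similarity of a document to an AND query over its terms."""
    a = query_weights or [1.0] * len(doc_weights)
    num = sum((ai ** p) * ((1.0 - wi) ** p) for ai, wi in zip(a, doc_weights))
    den = sum(ai ** p for ai in a)
    return 1.0 - (num / den) ** (1.0 / p)


weights = [0.2, 0.8]                 # document weights for two query terms
soft_or = pnorm_or(weights, p=1)     # p = 1: both operators reduce to the mean
soft_and = pnorm_and(weights, p=1)
strict_or = pnorm_or(weights, p=100)   # large p: approaches max(weights)
strict_and = pnorm_and(weights, p=100) # large p: approaches min(weights)
```

With p = 1 the AND and OR scores coincide (vector-like behaviour), and as p grows they approach the strict Boolean min and max, which is what lets a single parameter interpolate between the two models.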
author2 Computer Science and Applications
author_facet Computer Science and Applications
Daoud, Amjad M.
author Daoud, Amjad M.
author_sort Daoud, Amjad M.
title Efficient data structures for information retrieval
title_sort efficient data structures for information retrieval
publisher Virginia Tech
publishDate 2014
url http://hdl.handle.net/10919/40031
http://scholar.lib.vt.edu/theses/available/etd-10202005-102821/
work_keys_str_mv AT daoudamjadm efficientdatastructuresforinformationretrieval
_version_ 1719493409068023808