Solving the broken link problem in Walden's Paths

With the extent of the web expanding at an increasing rate, the problems caused by broken links are reaching epidemic proportions. Studies have indicated that a substantial number of links on the Internet are broken. User surveys indicate broken links are considered the third biggest problem faced o...


Bibliographic Details
Main Author: Dalal, Zubin Jamshed
Other Authors: Furuta, Richard
Format: Others
Language: en_US
Published: Texas A&M University 2004
Subjects: broken link problem; Walden's Paths; keyphrase extraction
Online Access: http://hdl.handle.net/1969.1/539
id ndltd-tamu.edu-oai-repository.tamu.edu-1969.1-539
record_format oai_dc
spelling ndltd-tamu.edu-oai-repository.tamu.edu-1969.1-539 | 2013-01-08T10:37:24Z | Solving the broken link problem in Walden's Paths | Dalal, Zubin Jamshed | broken link problem | Walden's Paths | keyphrase extraction | Texas A&M University | Furuta, Richard | 2004-09-30T02:09:22Z | 2004-09-30T02:09:22Z | 2003-08 | 2004-09-30T02:09:22Z | Book | Thesis | Electronic Thesis | text | 494376 bytes | 77021 bytes | electronic | application/pdf | text/plain | born digital | http://hdl.handle.net/1969.1/539 | en_US
collection NDLTD
language en_US
format Others
sources NDLTD
topic broken link problem
Walden's Paths
keyphrase extraction
description With the web expanding at an increasing rate, the problems caused by broken links are reaching epidemic proportions. Studies have indicated that a substantial number of links on the Internet are broken. User surveys indicate that broken links are considered the third biggest problem faced on the Internet. Currently, the Walden's Paths Path Manager tool is capable of detecting the degree and type of change within a page in a path. Although it can also highlight missing pages or broken links, it has no method of correcting them, which leaves the broken link problem unsolved. This thesis proposes a solution to this problem in Walden's Paths. The solution centers on the idea that "significant" keyphrases extracted from the original page can be used to accurately locate the document using a search engine. This thesis proposes an algorithm that extracts representative keyphrases to locate exact copies of the original page. In the absence of an exact copy, a similar but separate algorithm extracts keyphrases that help locate similar pages, which can be substituted for the missing page. Both sets of keyphrases are stored as additions to the page signature in the Path Manager tool and can be used when the original page is removed from its current location on the Web.
author2 Furuta, Richard
author Dalal, Zubin Jamshed
title Solving the broken link problem in Walden's Paths
publisher Texas A&M University
publishDate 2004
url http://hdl.handle.net/1969.1/539
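
The description in this record says that keyphrases extracted from an original page can be used to relocate it with a search engine, but it does not give the thesis's actual algorithm. The sketch below is therefore only an illustration of the general idea, using a simple frequency count over adjacent word pairs. The function names, the stopword list, and the quoted-phrase query format are all assumptions made for this example, not details taken from the thesis or from the Path Manager tool.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption; a real system would use a fuller one).
STOPWORDS = {
    "a", "an", "and", "are", "as", "be", "by", "can", "for", "in",
    "is", "it", "of", "on", "that", "the", "this", "to", "with",
}


def extract_keyphrases(text, max_phrases=3):
    """Return the most frequent two-word phrases in `text`.

    Hypothetical heuristic: count adjacent word pairs in which neither
    word is a stopword. Sentence boundaries are ignored for simplicity.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter()
    for first, second in zip(words, words[1:]):
        if first not in STOPWORDS and second not in STOPWORDS:
            counts[f"{first} {second}"] += 1
    # Sort by descending count, then alphabetically, so results are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [phrase for phrase, _ in ranked[:max_phrases]]


def build_query(phrases):
    """Join phrases into one query string, quoting each phrase for exact matching."""
    return " ".join(f'"{phrase}"' for phrase in phrases)


if __name__ == "__main__":
    page_text = (
        "Broken links hurt the web. Broken links are common. "
        "Path Manager detects broken links."
    )
    print(build_query(extract_keyphrases(page_text, max_phrases=1)))
```

In a setting like the one the description outlines, the phrases would be saved while the page is still reachable, and the query string would be sent to a search engine of choice once the link breaks. That step is left out here because it depends on the search service used.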