A Novel Algorithm Using Link Information To Discover Alternative Pages For 404 Errors

Master's === 國防大學理工學院 === 資訊工程碩士班 === 100 === Broken links are links that lead to websites that no longer exist, either because the websites were removed or because their URLs changed. Broken links significantly undermine references and source citations and leave a website's information incomplete. There are two traditi...

Bibliographic Details
Main Authors: Liao,Yishang, 廖詒旋
Other Authors: 陳善泰
Format: Others
Language: zh-TW
Published: 2012
Online Access: http://ndltd.ncl.edu.tw/handle/64782210512182981687
id ndltd-TW-100CCIT0394016
record_format oai_dc
spelling ndltd-TW-100CCIT0394016 2015-10-13T21:02:32Z http://ndltd.ncl.edu.tw/handle/64782210512182981687 A Novel Algorithm Using Link Information To Discover Alternative Pages For 404 Errors 利用連結資訊探勘失效網頁取代演算法 Liao,Yishang 廖詒旋 碩士 國防大學理工學院 資訊工程碩士班 100 陳善泰 2012 學位論文 ; thesis 53 zh-TW
collection NDLTD
sources NDLTD
description Master's === 國防大學理工學院 === 資訊工程碩士班 === 100 === Broken links are links that lead to websites that no longer exist, either because the websites were removed or because their URLs changed. Broken links significantly undermine references and source citations and leave a website's information incomplete. There are two traditional ways to repair broken links: index servers and search engines. Index servers cannot react quickly enough to update broken links caused by website relocation. Search engines, on the other hand, may find a large number of similar websites but cannot identify the original one; moreover, the choice of keywords strongly affects the search results. Neither method is therefore well suited to the problem. This research proposes a novel algorithm that uses link information to (1) recover broken links and (2) discover alternative pages for 404 errors, and achieves the following results: 1. Developing the Broken New Page Finding (BNPF) algorithm and implementing a BNPF system that realizes it. 2. Proving the theorem that if the URL of a website has changed, BNPF is guaranteed to discover the new website efficiently. 3. Showing that if a website has been removed, BNPF can find an alternative website similar to the original one. Experimental results show that BNPF achieves both higher similarity and a higher hit rate than Google search.