Approaching Google Ranking with Semantically Related Terms


Bibliographic Details
Main Authors: Chun-Ju Li, 李淳如
Other Authors: Cheng-Jye Luh
Format: Others
Language: zh-TW
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/75652668585464020794
Description
Summary: Master's === Yuan Ze University === Department of Information Management === 99 === This study aims to approximate Google ranking results using terms semantically related to the query. First, we crawled Google search results and extracted each page's title, snippet, and URL, and then identified semantically related terms using two approaches: Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). Second, we calculated scores for keywords in the title, snippet, and URL to obtain a document score. Several experiments were conducted on different combinations of the number of semantically related terms, the number of documents, the tokenization method (unigram or n-gram), and the number of topics (one or two) from which the semantically related terms were taken. The experimental results showed that the average R-Precision reached 0.8, indicating that the ranking produced by the proposed method approximates Google's results.
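
The scoring and evaluation steps described in the summary can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The field weights, the substring-matching rule, and the function names `document_score`, `rank`, and `r_precision` are all assumptions made for illustration, since the abstract does not give the actual formula. R-Precision is assumed here to mean the overlap between the top R documents of the proposed ranking and the top R documents of the reference (Google) ranking, divided by R.

```python
from typing import Dict, List, Sequence

# Hypothetical field weights: the abstract does not state the real values.
FIELD_WEIGHTS = {"title": 3.0, "snippet": 1.0, "url": 2.0}


def document_score(doc: Dict[str, str], terms: Sequence[str]) -> float:
    """Sum, over title/snippet/URL, the weighted number of terms found.

    Matching is a case-insensitive substring test, a simplification
    chosen for this sketch rather than the thesis's tokenization.
    """
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        text = doc.get(field, "").lower()
        hits = sum(1 for term in terms if term.lower() in text)
        score += weight * hits
    return score


def rank(docs: Sequence[Dict[str, str]], terms: Sequence[str]) -> List[Dict[str, str]]:
    """Order documents by descending score (ties keep their input order)."""
    return sorted(docs, key=lambda d: document_score(d, terms), reverse=True)


def r_precision(predicted: Sequence[str], reference: Sequence[str], r: int) -> float:
    """Fraction of the predicted top-r items that also appear in the
    reference top-r (assumed definition, see the lead-in)."""
    if r <= 0:
        raise ValueError("r must be positive")
    return len(set(predicted[:r]) & set(reference[:r])) / r
```

In this sketch, `terms` would hold the query plus the semantically related terms obtained from LSA or LDA, and `reference` would be the list of URLs in the order Google returned them.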