Summary: | Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 99 === With huge bibliographic datasets now available, on-line academic search services have become widespread. Most of these services retrieve papers whose titles or abstracts contain terms that match the query terms. As a result, the query results suffer from the limitations of keyword matching. In this paper, we explore the Mixed Media Graph (abbreviated as MMG), in which each vertex represents one entity and edges reflect linkage relationships. Vertices in an MMG may represent different entity types, such as papers, authors and terms. Thus, an MMG fully reflects linkage relationships among different entities. Prior work has demonstrated that, by using similarity search over cross-entity and identical-entity relationships, MMG is able to retrieve more relevant entities. Furthermore, our proposed academic search can provide a variety of query results, such as relevant papers, relevant authors and relevant conferences, from a single query. When a user submits a query, we use Random Walk with Restart (abbreviated as RWR) on the MMG to retrieve relevant entities and determine their ranking scores. Specifically, given a whole bibliographic dataset, we propose Global-MMG, in which a global MMG is built for RWR. To reduce the query response time, we further develop Net-MMG (standing for NetClus-based MMG), which performs RWR in topic-based sub-graphs derived by the prior work NetClus. We implement our academic search and conduct extensive experiments on the ACM Digital Library to evaluate the proposed Global-MMG and Net-MMG. Experimental results show that, by exploring MMG and RWR, both Global-MMG and Net-MMG achieve good precision and accuracy. In addition, Net-MMG has a short query response time while still guaranteeing good quality of query results.
|
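
As a rough illustration of how Random Walk with Restart can rank entities on a mixed graph of papers, authors and terms, the following minimal sketch may help. It is not the thesis implementation: the toy graph, the node labels (`paper:P1`, `author:A`, `term:graph`), the function name `random_walk_with_restart`, the restart probability of 0.15 and the unweighted, undirected edges are all assumptions made for the example.

```python
from collections import defaultdict


def random_walk_with_restart(edges, query_nodes, restart_prob=0.15,
                             tol=1e-10, max_iter=1000):
    """Return RWR scores for every node, restarting at the query nodes.

    Assumes an undirected, unweighted graph given as (u, v) pairs.
    """
    # Build an adjacency structure covering all entity types at once.
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    nodes = sorted(neighbors)

    missing = [q for q in query_nodes if q not in neighbors]
    if missing:
        raise KeyError(f"query nodes not in graph: {missing}")

    # Restart vector: uniform over the query nodes, zero elsewhere.
    restart = {n: 0.0 for n in nodes}
    for q in query_nodes:
        restart[q] = 1.0 / len(query_nodes)

    scores = dict(restart)
    for _ in range(max_iter):
        # With probability restart_prob the walker jumps back to the query.
        nxt = {n: restart_prob * restart[n] for n in nodes}
        # Otherwise it moves to a uniformly chosen neighbor.
        for u in nodes:
            share = (1.0 - restart_prob) * scores[u] / len(neighbors[u])
            for v in neighbors[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - scores[n]) for n in nodes)
        scores = nxt
        if delta < tol:
            break
    return scores


# Hypothetical toy graph: terms and authors link to the papers they appear in.
edges = [
    ("term:graph", "paper:P1"),
    ("term:graph", "paper:P2"),
    ("term:search", "paper:P2"),
    ("term:search", "paper:P3"),
    ("author:A", "paper:P1"),
    ("author:A", "paper:P2"),
    ("author:B", "paper:P3"),
]

scores = random_walk_with_restart(edges, ["term:graph"])

# One run yields rankings for several entity types; filter by label prefix.
ranked_papers = sorted((n for n in scores if n.startswith("paper:")),
                       key=scores.get, reverse=True)
ranked_authors = sorted((n for n in scores if n.startswith("author:")),
                        key=scores.get, reverse=True)
print(ranked_papers, ranked_authors)
```

Because a single score vector covers all vertices, one query can return ranked papers and ranked authors together. Under the same assumptions, a Net-MMG style variant would presumably call the same function on the smaller edge list of a topic sub-graph rather than on the global graph.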