Relevance feedback-based optimization of search queries for Patents
In this project, we design a search query optimization system based on the user’s relevance feedback by generating customized query strings for existing patent alerts. Firstly, the Rocchio algorithm is used to generate a search string by analyzing the characteristics of related patents and unrelated...
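Main Author: | Cheng, Sijin |
---|---|
Format: | Others |
Language: | English |
Published: | Linköpings universitet, Interaktiva och kognitiva system, 2019 |
Subjects: | Patent Search; Query Reformulation; Recommendation System; Matrix Decomposition; Text Processing; Computer Systems; Datorsystem |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-154173 |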
id |
ndltd-UPSALLA1-oai-DiVA.org-liu-154173 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UPSALLA1-oai-DiVA.org-liu-154173; 2019-02-02T06:10:02Z; Relevance feedback-based optimization of search queries for Patents; eng; Cheng, Sijin; Linköpings universitet, Interaktiva och kognitiva system; 2019; Patent Search; Query Reformulation; Recommendation System; Matrix Decomposition; Text Processing; Computer Systems; Datorsystem; Student thesis; info:eu-repo/semantics/bachelorThesis; text; http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-154173; application/pdf; info:eu-repo/semantics/openAccess |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
Patent Search; Query Reformulation; Recommendation System; Matrix Decomposition; Text Processing; Computer Systems; Datorsystem |
description |
In this project, we design a search query optimization system that uses the user’s relevance feedback to generate customized query strings for existing patent alerts. First, the Rocchio algorithm generates a search string by analyzing the characteristics of related and unrelated patents. Then a collaborative filtering recommendation algorithm ranks the query results, taking into account previous relevance feedback and patent features rather than only the similarity between the query and the patents, as traditional methods do. To further evaluate the optimization system, we design and conduct a series of experiments using TF-IDF as the baseline method. The experiments show that, with the generated search strings, the proportion of unrelated patents in the search results decreases significantly over time: within 4 months, the precision of the retrieved results improves from 53.5% to 72%. Moreover, the proposed method ranks results better than the baseline: in terms of precision, the top-10 results of the recommendation algorithm are about 5 percentage points higher than the baseline, and the top-20 results are about 7.5% higher. We conclude that the proposed approach can effectively optimize patent search results by learning from relevance feedback. |
author |
Cheng, Sijin |
title |
Relevance feedback-based optimization of search queries for Patents |
publisher |
Linköpings universitet, Interaktiva och kognitiva system |
publishDate |
2019 |
url |
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-154173 |
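
The description above names two standard building blocks: a Rocchio query update driven by relevance feedback, and precision over the top-ranked results. The following is a minimal Python sketch of both, not the thesis implementation. The function names `rocchio` and `precision_at_k`, the dictionary representation of term weights, and the weights `alpha`, `beta`, and `gamma` (common textbook defaults) are all assumptions made for illustration; the record does not state which values or data structures the author used.

```python
from collections import defaultdict


def centroid(vectors):
    """Average a list of {term: weight} vectors; an empty list gives {}."""
    if not vectors:
        return {}
    total = defaultdict(float)
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def rocchio(query, related, unrelated, alpha=1.0, beta=0.75, gamma=0.15):
    """Hypothetical helper: move the query toward related patents and
    away from unrelated ones. alpha/beta/gamma are assumed defaults,
    not values taken from the thesis."""
    pos = centroid(related)
    neg = centroid(unrelated)
    updated = {}
    for term in set(query) | set(pos) | set(neg):
        weight = (alpha * query.get(term, 0.0)
                  + beta * pos.get(term, 0.0)
                  - gamma * neg.get(term, 0.0))
        if weight > 0:  # drop terms pushed to zero or below
            updated[term] = weight
    return updated


def precision_at_k(ranked_ids, related_ids, k):
    """Fraction of the top-k ranked results that the user marked as related."""
    top_k = ranked_ids[:k]
    return sum(1 for doc_id in top_k if doc_id in related_ids) / k
```

In a setup like the one described, the highest-weighted terms of the updated vector could be joined into a new search string for the patent alert, and `precision_at_k` with `k=10` or `k=20` would correspond to the top-10 and top-20 comparisons mentioned in the description.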