Semantic Query Expansion for Content-based Image Retrieval and Its Application on Video Advertising
Main Authors: | Wei-Shing Liao 廖瑋星 |
---|---|
Other Authors: | Winston H. Hsu 徐宏民 |
Format: | Others |
Language: | en_US |
Published: | 2010 |
Online Access: | http://ndltd.ncl.edu.tw/handle/99958344443046494763 |
id |
ndltd-TW-098NTU05392008 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-098NTU05392008 2015-10-13T13:43:18Z http://ndltd.ncl.edu.tw/handle/99958344443046494763 Semantic Query Expansion for Content-based Image Retrieval and Its Application on Video Advertising 基於圖片內容擷取之圖像語意擴充與其影片廣告安插之應用 Wei-Shing Liao 廖瑋星 碩士 臺灣大學 資訊工程學研究所 98 Winston H. Hsu 徐宏民 2010 學位論文 ; thesis 59 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 98 === Content-based image retrieval (CBIR) is one of the essential techniques for managing exponentially growing photo collections and an enabling technology for many applications such as annotation by search, computational photography, and photo-based question answering. Despite decades of research, current solutions remain limited by the "semantic gap." In this work, we propose to improve CBIR by automatically exploiting auxiliary knowledge (e.g., tags, photos, blogs, and metadata) from booming media-sharing services (e.g., Flickr) and search engines (e.g., Google), i.e., finding more semantically related images to enhance the CBIR results. To demonstrate the benefits of semantic expansion, we further apply the proposed framework in a promising application – video advertising by target image matching – which automatically associates relevant ads through content-based matching over related image objects (e.g., logos and scenes). Given the prevalence of shared videos, this is a promising avenue for Internet monetization.
Since traditional CBIR methods are hindered by the semantic gap, our approach leverages the content and context information widely available in media-sharing services. However, such community-contributed cues (e.g., tags, descriptions, image appearance, and metadata) are generally noisy. We measure the semantic similarity between tags by exploiting Google knowledge, and we adopt a graph-based approach with one graph model per cue. After aggregating the multiple cues (as graphs) in a linear manner, we perform a random walk over the unified graph, seeded from the initial CBIR search results, to further improve precision and recall. Since each graph is built independently, the fusion weights can be adapted to different applications. We also address efficiency issues for deployment on large-scale media-sharing sites. Meanwhile, the framework is generic and can be extended to other applications such as keyword-based image retrieval and image annotation.
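A minimal sketch of the linear graph fusion and random walk described above is given below. It assumes per-cue similarity matrices, fusion weights, and initial CBIR scores are already available; the function and parameter names (fuse_and_random_walk, restart_prob, etc.) are illustrative and not taken from the thesis.

```python
import numpy as np

def fuse_and_random_walk(cue_graphs, cue_weights, initial_scores,
                         restart_prob=0.15, max_iter=100, tol=1e-6):
    """Fuse per-cue similarity graphs linearly and run a random walk with
    restart, seeded by the initial CBIR ranking scores (hypothetical sketch)."""
    # Linear aggregation of the cue graphs into one unified graph.
    fused = np.asarray(sum(w * g for w, g in zip(cue_weights, cue_graphs)),
                       dtype=float)

    # Row-normalize into a transition matrix, guarding against empty rows.
    row_sums = fused.sum(axis=1, keepdims=True)
    transition = np.divide(fused, row_sums,
                           out=np.zeros_like(fused), where=row_sums > 0)

    # Restart distribution: the initial CBIR results act as the seed nodes.
    restart = np.asarray(initial_scores, dtype=float)
    restart = restart / restart.sum()

    # Power iteration: p <- (1 - a) * P^T p + a * restart
    p = restart.copy()
    for _ in range(max_iter):
        p_next = (1 - restart_prob) * transition.T @ p + restart_prob * restart
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p  # stationary scores used to re-rank the candidate images
```

Under this sketch, the returned scores would re-rank the expanded candidate set; choosing the per-cue weights is where the application-specific adaptation mentioned above would come in.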
To demonstrate the benefits of semantic expansion, we also propose a framework called AdVis, which automatically associates relevant ads by visual matching. Here, a bidder bids for ad placements with images of interest – adImages – analogous to keywords in the AdWords model. AdVis aims to maximize both system revenue and user perception. We formulate the solution as a nonlinear 0-1 integer programming problem.
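For illustration only, a simplified linear 0-1 assignment program of the kind AdVis builds on might look as follows; the symbols (x_ij, b_i, r_ij, u_ij, alpha) are hypothetical, and the thesis itself formulates a nonlinear objective, so this only shows the 0-1 decision structure.

```latex
% Hypothetical sketch: place adImage i at candidate video slot j (x_{ij} in {0,1}),
% trading off bid revenue against user perception.
\begin{aligned}
\max_{x_{ij} \in \{0,1\}} \quad & \sum_{i} \sum_{j}
    \bigl( \alpha \, b_i \, r_{ij} + (1 - \alpha) \, u_{ij} \bigr) \, x_{ij} \\
\text{s.t.} \quad & \sum_{i} x_{ij} \le 1 \quad \text{for every slot } j, \\
                  & \sum_{j} x_{ij} \le 1 \quad \text{for every ad } i,
\end{aligned}
```

where b_i is the bid for adImage i, r_ij its visual relevance to slot j, u_ij a user-perception term, and alpha a trade-off weight.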
Experimenting over Flickr photo benchmarks, the proposed semantic expansion framework performs saliently in several aspects: (1) example-based image retrieval: outperforming traditional CBIR systems by up to 200%; (2) text-based image retrieval: salient performance gains over conventional keyword-based search, which suffers from noisy and missing tags; (3) image auto-annotation: significant gains over other search-based image annotation approaches.
|
author2 |
Winston H. Hsu |
author |
Wei-Shing Liao 廖瑋星 |
title |
Semantic Query Expansion for Content-based Image Retrieval and Its Application on Video Advertising |
publishDate |
2010 |
url |
http://ndltd.ncl.edu.tw/handle/99958344443046494763 |