Comirror: A Search Engine for Comic Web Based on Textual and Visual Correlation Mining


Bibliographic Details
Main Authors: Sun, Shi Tong, 孫世通
Other Authors: Shan, Man Kwan
Format: Others
Language: zh-TW
Online Access: http://ndltd.ncl.edu.tw/handle/c58bc9
Description
Summary: Master's === National Chengchi University === Department of Computer Science === 102 === Animations, comics, and games (ACG) have grown increasingly popular in recent years. Many ACG web sites carry extensive textual and visual information about the stories, characters, and authors of these works. Most of these sites offer text retrieval for searching textual content, but users also need to search textual and visual content by style, such as the drawing style of comic characters or the narrative style of animation stories. To support search by similar style, this thesis investigates and develops Comirror, a search engine for ACG web sites based on the latent correlation between textual and visual content. First, because the facial styles of characters play an important role in ACG, comic faces are detected and their visual features are extracted and represented using Local Binary Patterns (LBP) together with gray-value histograms. For textual content, a traditional full-text indexing technique is used to extract textual features. Hierarchical clustering is then applied to quantize the textual and visual features into textual and visual words. Finally, Latent Dirichlet Allocation (LDA) is used to discover the latent semantic correlation between visual and textual words. Experiments show that the developed approach outperforms the baseline approaches.
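
The visual feature step named in the summary (LBP combined with a gray-value histogram) can be illustrated with a minimal sketch. This record does not state which LBP variant, neighbourhood, or bin counts the thesis uses, so the basic 8-neighbour LBP, the 16-bin gray histogram, and all function names below are assumptions chosen for illustration, not the author's implementation. The input is assumed to be an already detected and cropped grayscale face region.

```python
from typing import List

Image = List[List[int]]  # grayscale image, pixel values in 0..255

# Clockwise 8-neighbour offsets, starting at the top-left pixel (assumed ordering).
_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img: Image, r: int, c: int) -> int:
    """Basic LBP: set bit i when neighbour i is >= the centre pixel."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(_OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img: Image) -> List[float]:
    """Normalized 256-bin histogram of LBP codes over interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    total = sum(hist) or 1  # avoid division by zero on tiny images
    return [h / total for h in hist]


def gray_histogram(img: Image, bins: int = 16) -> List[float]:
    """Normalized histogram of gray values (bin count is an assumption)."""
    hist = [0] * bins
    for row in img:
        for v in row:
            hist[v * bins // 256] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]


def face_features(img: Image, bins: int = 16) -> List[float]:
    """Concatenate the texture (LBP) and intensity (gray) descriptors."""
    return lbp_histogram(img) + gray_histogram(img, bins)
```

In a pipeline like the one summarized above, vectors of this kind would then be clustered so that each cluster acts as a visual word. That quantization step and the LDA modelling are not shown here.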