Summary: | Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 98 === Given the sheer volume of user-contributed photos and videos, we argue for leveraging such freely available image collections as training images for image classification. We propose an image expansion framework that mines additional semantically related training images when only very few training examples are provided. The expansion is based on a semantic graph that considers both visual and (noisy) textual similarities in the auxiliary image collections, and we also address scalability (e.g., using MapReduce) when constructing the graph. We find that the expanded images not only reduce time-consuming annotation effort but also improve classification accuracy, because they add visually diverse training images to the limited original training set. In benchmark experiments, we show that the expanded training images improve image classification significantly. Furthermore, we achieve more than 25% relative improvement in accuracy over existing state-of-the-art methods that similarly aim to mine training images from media sharing services (i.e., Flickr).
|