Leveraging Dominant Language Image Tags for Automatic Image Annotation in Minor Languages

Image annotations, often in the form of tags, are very useful when indexing large image collections. They provide an intuitive, human-centered way to search and browse images using text queries. However, tagging images manually is very time-consuming, so researchers have developed methods for au...

Bibliographic Details
Main Author:	Wennerström, Hjalmar
Format:	Others
Language:	English
Published:	Uppsala universitet, Institutionen för informationsteknologi 2010
Online Access:	http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-129446

id	ndltd-UPSALLA1-oai-DiVA.org-uu-129446
record_format	oai_dc
spelling	ndltd-UPSALLA1-oai-DiVA.org-uu-129446 | 2013-01-08T13:49:21Z | Leveraging Dominant Language Image Tags for Automatic Image Annotation in Minor Languages | eng | Wennerström, Hjalmar | Uppsala universitet, Institutionen för informationsteknologi | 2010 | Student thesis | info:eu-repo/semantics/bachelorThesis | text | http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-129446 | UPTEC IT, 1401-5749 ; 10 013 | application/pdf | info:eu-repo/semantics/openAccess
collection	NDLTD
language	English
format	Others
sources	NDLTD
description	Image annotations, often in the form of tags, are very useful when indexing large image collections. They provide an intuitive, human-centered way to search and browse images using text queries. However, tagging images manually is very time-consuming, so researchers have developed methods for automatic image tagging. These methods rely on a set of example images with tags to learn which images should be associated with which tags. One thing that has been overlooked in these systems is that the example images and tags differ from language to language. Researchers have generally built automatic tagging systems only for English and have not considered the problems of building equally good systems in minor languages, where it is more difficult to obtain example images and tags. In this thesis we study how an automatic tagging system in Japanese compares to one in English. We find that the Japanese system performs worse, and we therefore improve it by leveraging the dominant English-language system. We compare an automatic translation of the tags using a dictionary with our proposed translation matrix method, which estimates the translation of tags based on the co-occurrence of tags from different languages in images. We show that our proposed method, using very simple heuristics, performs about the same as a high-end machine translator in the case of automatic tagging systems. Several improvements remain to be made, but this work shows that the conceptual idea is strong, giving reasons to improve it further. The main contribution of our approach is the ability to translate words that a dictionary cannot interpret and to take context into account when establishing a translation.
author	Wennerström, Hjalmar
title	Leveraging Dominant Language Image Tags for Automatic Image Annotation in Minor Languages
publisher	Uppsala universitet, Institutionen för informationsteknologi
publishDate	2010
url	http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-129446
