Deep Visual Semantic Transform Model Learning from Multi-Label Images
Master's === National Taiwan Normal University === Department of Computer Science and Information Engineering === 105 === Learning the relation between images and text semantics has been an important problem in the field of machine learning and computer vision. This paper addresses this problem. We observe that there is a semantic relation between texts, for example, “sky” and “cl...
Main Authors: | Lee, Yi-Nan 李奕男 |
---|---|
Other Authors: | Yeh, Mei-Chen 葉梅珍 |
Format: | Others |
Language: | zh-TW |
Published: | 2017 |
Online Access: | http://ndltd.ncl.edu.tw/handle/48kv54 |
id |
ndltd-TW-105NTNU5392026 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-105NTNU5392026 ; 2019-05-15T23:46:59Z ; http://ndltd.ncl.edu.tw/handle/48kv54 ; Deep Visual Semantic Transform Model Learning from Multi-Label Images ; 從多標籤圖像學習之深層視覺語意轉換模型 ; Lee, Yi-Nan 李奕男 ; Master's ; National Taiwan Normal University ; Department of Computer Science and Information Engineering ; 105 ; Yeh, Mei-Chen 葉梅珍 ; 2017 ; 學位論文 ; thesis ; 42 ; zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan Normal University === Department of Computer Science and Information Engineering === 105 === Learning the relation between images and text semantics is an important problem in machine learning and computer vision, and this thesis addresses it. We observe that words are semantically related to one another to different degrees: for example, “sky” and “cloud” are closely related, whereas “sky” and “car” are only weakly related. We hypothesize that the semantic relation between words can vary depending on the image. For example, consider an image that contains both sky and a car. The words “sky” and “car” are semantically unrelated on their own, but become connected because the image contains both concepts. We therefore propose a Convolutional Neural Network based model that links an image to the semantics of its text labels. The main difference between our work and existing visual semantic embedding models is that the output of our model is a linear transformation function. In other words, each input image is treated as a function that determines the relation between each word and the image and is used to predict the likely labels for the image. Finally, the model is validated on the NUS-WIDE dataset, and the experimental results show that it performs well at predicting labels for images. |
author2 |
Yeh, Mei-Chen |
author_facet |
Yeh, Mei-Chen ; Lee, Yi-Nan 李奕男 |
author |
Lee, Yi-Nan 李奕男 |
author_sort |
Lee, Yi-Nan |
title |
Deep Visual Semantic Transform Model Learning from Multi-Label Images |
publishDate |
2017 |
url |
http://ndltd.ncl.edu.tw/handle/48kv54 |
_version_ |
1719153863993327616 |
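
The abstract in this record says that the model outputs a linear transformation function, so that each image acts as a function relating words to that image. The record gives no architectural details, so the following is only a minimal sketch of one possible reading: an image feature vector is mapped to the weights of a linear function, and that function is applied to word embeddings to score candidate labels. All names (`image_to_linear_map`, `score_labels`, `predict_labels`), the toy embeddings, the head weights, and the image features are hypothetical and are not taken from the thesis.

```python
# Toy sketch (assumption, not the thesis's architecture): an image is turned
# into an image-specific linear function that scores word embeddings.

def image_to_linear_map(image_features, head_weights):
    """Stand-in for a CNN head: return the weight vector of a linear function.

    head_weights has one row per word-embedding dimension; each row is
    combined with the image features by a dot product.
    """
    return [sum(w * f for w, f in zip(row, image_features))
            for row in head_weights]


def score_labels(linear_map, word_embeddings):
    """Apply the image-specific linear function to every candidate word."""
    return {word: sum(a * e for a, e in zip(linear_map, emb))
            for word, emb in word_embeddings.items()}


def predict_labels(scores, top_k=2):
    """Return the top_k words with the highest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical 2-D word embeddings (made up for illustration).
WORD_EMBEDDINGS = {
    "sky": [1.0, 0.0],
    "cloud": [0.6, 0.0],
    "car": [0.0, 1.0],
    "dog": [-1.0, -1.0],
}

# Hypothetical head weights: 2 embedding dimensions x 3 image features.
HEAD_WEIGHTS = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]

# Made-up features for an image that shows both sky and a car.
image_features = [0.8, 0.9, 0.1]

linear_map = image_to_linear_map(image_features, HEAD_WEIGHTS)
scores = score_labels(linear_map, WORD_EMBEDDINGS)
predicted = predict_labels(scores, top_k=2)
print(predicted)
```

In this toy setup “sky” and “car” point in unrelated directions in the embedding space, yet both receive high scores for this particular image, which mirrors the abstract's point that an image can connect otherwise unrelated words.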