Food Image Captioning with Verb-Noun Pairs Empowered by Joint Correlation

Master's thesis === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 103 === Studies of image captioning have emerged explosively in the past two years. Although many elegant approaches have been proposed for general-purpose image captioning, incorporating domain knowledge or a specific description structure in a targeted domain still remains undisco...

Bibliographic Details
Main Authors: LIN,JIA-HSING, 林家興
Other Authors: CHU,WEI-TA
Format: Others
Language: en_US
Published: 2016
Online Access: http://ndltd.ncl.edu.tw/handle/21674221727413201079
id ndltd-TW-103CCU00392104
record_format oai_dc
spelling ndltd-TW-103CCU00392104 2017-06-03T04:41:38Z http://ndltd.ncl.edu.tw/handle/21674221727413201079 Food Image Captioning with Verb-Noun Pairs Empowered by Joint Correlation 聯合模型相關性產生動詞名詞配對描述食物影像 LIN,JIA-HSING 林家興 Master's thesis National Chung Cheng University Graduate Institute of Computer Science and Information Engineering 103 Studies of image captioning have emerged explosively in the past two years. Although many elegant approaches have been proposed for general-purpose image captioning, incorporating domain knowledge or a domain-specific description structure remains unexplored. In this thesis, we concentrate on food image captioning, where a food image is better described not only by what food it is but also by how it was cooked. We propose neural networks that jointly consider multiple factors, i.e., food recognition, ingredient recognition, and cooking method recognition, and verify that recognition performance can be improved by taking multiple factors into account. With these three factors, food image captions composed of verb-noun pairs (usually a cooking method followed by ingredients) can be generated. We demonstrate the effectiveness of the proposed methods from various viewpoints, and believe this is a better way to describe food images than general-purpose image captioning. CHU,WEI-TA 朱威達 2016 thesis 27 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's thesis === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 103 === Studies of image captioning have emerged explosively in the past two years. Although many elegant approaches have been proposed for general-purpose image captioning, incorporating domain knowledge or a domain-specific description structure remains unexplored. In this thesis, we concentrate on food image captioning, where a food image is better described not only by what food it is but also by how it was cooked. We propose neural networks that jointly consider multiple factors, i.e., food recognition, ingredient recognition, and cooking method recognition, and verify that recognition performance can be improved by taking multiple factors into account. With these three factors, food image captions composed of verb-noun pairs (usually a cooking method followed by ingredients) can be generated. We demonstrate the effectiveness of the proposed methods from various viewpoints, and believe this is a better way to describe food images than general-purpose image captioning.
author2 CHU,WEI-TA
author_facet CHU,WEI-TA
LIN,JIA-HSING
林家興
author LIN,JIA-HSING
林家興
spellingShingle LIN,JIA-HSING
林家興
Food Image Captioning with Verb-Noun Pairs Empowered by Joint Correlation
author_sort LIN,JIA-HSING
title Food Image Captioning with Verb-Noun Pairs Empowered by Joint Correlation
title_short Food Image Captioning with Verb-Noun Pairs Empowered by Joint Correlation
title_full Food Image Captioning with Verb-Noun Pairs Empowered by Joint Correlation
title_fullStr Food Image Captioning with Verb-Noun Pairs Empowered by Joint Correlation
title_full_unstemmed Food Image Captioning with Verb-Noun Pairs Empowered by Joint Correlation
title_sort food image captioning with verb-noun pairs empowered by joint correlation
publishDate 2016
url http://ndltd.ncl.edu.tw/handle/21674221727413201079