“Never fry carrots without chopping” Generating Cooking Recipes from Cooking Videos Using Deep Learning Considering Previous Process

Bibliographic Details
Main Authors: Tatsuki Fujii, Yuichi Sei, Yasuyuki Tahara, Ryohei Orihara, Akihiko Ohsuga
Format: Article
Language: English
Published: Atlantis Press
Series: International Journal of Networked and Distributed Computing (IJNDC)
Online Access: https://www.atlantis-press.com/article/125913572/view
Description
Summary: Research on deep-learning captioning models that describe the contents of images and videos in natural language has produced considerable results and attracted attention in recent years. In this research, we aim to generate recipe sentences from cooking videos acquired from YouTube. We treat this as an image-captioning task and propose two methods suited to it. The first adds a vector of a sentence already generated in the same recipe to the input of the captioning model. The second is a data-processing method intended to improve accuracy. We compare the generated sentences with the correct recipe sentences and calculate scores using several metrics widely used for evaluating image captioning. As a baseline, we train the simplest encoder–decoder model on the same data, compare its output with the correct recipe sentences, and calculate the same metrics. The results indicate that our proposed methods help increase accuracy.
ISSN: 2211-7946