Hybrid deep neural network for Bangla automated image descriptor

Automated image to text generation is a computationally challenging computer vision task which requires sufficient comprehension of both syntactic and semantic meaning of an image to generate a meaningful description. Until recent times, it has been studied to a limited scope due to the lack of visu...

Bibliographic Details
Main Authors: Md Asifuzzaman Jishan, Khan Raqib Mahmud, Abul Kalam Al Azad, Md Shahabub Alam, Anif Minhaz Khan
Format: Article
Language: English
Published: Universitas Ahmad Dahlan 2020-07-01
Series: IJAIN (International Journal of Advances in Intelligent Informatics)
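Subjects: convolutional neural network; hybrid recurrent neural network; long short-term memory; bi-directional RNN; natural language descriptors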
Online Access: http://ijain.org/index.php/IJAIN/article/view/499
id doaj-35f729bb54184734afbaca24275e1c9c
record_format Article
spelling doaj-35f729bb54184734afbaca24275e1c9c
2020-11-25T02:59:45Z
eng
Universitas Ahmad Dahlan
IJAIN (International Journal of Advances in Intelligent Informatics)
2442-6571
2548-3161
2020-07-01
6
2
109
122
10.26555/ijain.v6i2.499
147
Hybrid deep neural network for Bangla automated image descriptor
Md Asifuzzaman Jishan (0)
Khan Raqib Mahmud (1)
Abul Kalam Al Azad (2)
Md Shahabub Alam (3)
Anif Minhaz Khan (4)
(0) Department of Statistics, Technische Universität Dortmund
(1) University of Liberal Arts Bangladesh
(2) University of Liberal Arts Bangladesh
(3) Department of Statistics, Technische Universität Dortmund
(4) Department of Statistics, Technische Universität Dortmund
Automated image-to-text generation is a computationally challenging computer vision task that requires sufficient comprehension of both the syntactic and semantic meaning of an image to generate a meaningful description. Until recently, it has been studied only to a limited extent because of the lack of visual-descriptor datasets and of functional models that capture the intrinsic complexities of image features. In this study, a novel dataset, called Bangla Natural Language Image to Text (BNLIT), was constructed by generating Bangla textual descriptors from visual input; it incorporates 100 classes with annotations. A deep neural network-based image captioning model was proposed to generate image descriptions. The model employs a Convolutional Neural Network (CNN) to classify the whole dataset, while a Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) capture the sequential semantic representation of text-based sentences and generate a pertinent description based on the modular complexities of an image. When tested on the new dataset, the model achieves a significant improvement in performance on the image semantic retrieval task. For this experiment, we implemented a hybrid image captioning model, which achieved remarkable results on the new self-built dataset; the task is new in the Bangladeshi context. In brief, the model provides benchmark precision in reconstructing characteristic Bangla syntax, together with a comprehensive numerical analysis of the model's results on the dataset.
http://ijain.org/index.php/IJAIN/article/view/499
convolutional neural network
hybrid recurrent neural network
long short-term memory
bi-directional rnn
natural language descriptors
collection DOAJ
language English
format Article
sources DOAJ
author Md Asifuzzaman Jishan
Khan Raqib Mahmud
Abul Kalam Al Azad
Md Shahabub Alam
Anif Minhaz Khan
title Hybrid deep neural network for Bangla automated image descriptor
publisher Universitas Ahmad Dahlan
series IJAIN (International Journal of Advances in Intelligent Informatics)
issn 2442-6571
2548-3161
publishDate 2020-07-01
description Automated image-to-text generation is a computationally challenging computer vision task that requires sufficient comprehension of both the syntactic and semantic meaning of an image to generate a meaningful description. Until recently, it has been studied only to a limited extent because of the lack of visual-descriptor datasets and of functional models that capture the intrinsic complexities of image features. In this study, a novel dataset, called Bangla Natural Language Image to Text (BNLIT), was constructed by generating Bangla textual descriptors from visual input; it incorporates 100 classes with annotations. A deep neural network-based image captioning model was proposed to generate image descriptions. The model employs a Convolutional Neural Network (CNN) to classify the whole dataset, while a Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) capture the sequential semantic representation of text-based sentences and generate a pertinent description based on the modular complexities of an image. When tested on the new dataset, the model achieves a significant improvement in performance on the image semantic retrieval task. For this experiment, we implemented a hybrid image captioning model, which achieved remarkable results on the new self-built dataset; the task is new in the Bangladeshi context. In brief, the model provides benchmark precision in reconstructing characteristic Bangla syntax, together with a comprehensive numerical analysis of the model's results on the dataset.
topic convolutional neural network
hybrid recurrent neural network
long short-term memory
bi-directional rnn
natural language descriptors
url http://ijain.org/index.php/IJAIN/article/view/499