A Temporal Fusion Approach for Video Classification with Convolutional and LSTM Neural Networks Applied to Violence Detection

In modern smart cities, there is a quest for the highest level of integration and automation service. In the surveillance sector, one of the main challenges is to automate the analysis of videos in real-time to identify critical situations. This paper presents intelligent models based on Convolutio...


Bibliographic Details
Main Authors: Jean Phelipe de Oliveira Lima, Carlos Maurício Seródio Figueiredo
Format: Article
Language: English
Published: Asociación Española para la Inteligencia Artificial 2021-02-01
Series: Inteligencia Artificial
Subjects: Applications of AI, Deep Learning, Intelligent Video Processing, Violence Detection
Online Access: https://journal.iberamia.org/index.php/intartif/article/view/573
id doaj-2cd6e88732b945c7ac7e2c1f86597b26
record_format Article
spelling doaj-2cd6e88732b945c7ac7e2c1f86597b26
2021-03-07T01:11:13Z
eng
Asociación Española para la Inteligencia Artificial
Inteligencia Artificial
1137-3601
1988-3064
2021-02-01
24
67
10.4114/intartif.vol24iss67pp40-50
A Temporal Fusion Approach for Video Classification with Convolutional and LSTM Neural Networks Applied to Violence Detection
Jean Phelipe de Oliveira Lima 0
Carlos Maurício Seródio Figueiredo 1
Universidade do Estado do Amazonas, Brazil
Universidade do Estado do Amazonas, Brazil
In modern smart cities, there is a quest for the highest level of integration and automation service. In the surveillance sector, one of the main challenges is to automate the analysis of videos in real-time to identify critical situations. This paper presents intelligent models based on Convolutional Neural Networks (for which the MobileNet, InceptionV3 and VGG16 networks were used), LSTM networks and feedforward networks for the task of classifying videos under the classes "Violence" and "Non-Violence", using the RLVS database. Different data representations were used according to the Temporal Fusion techniques. The best result was an accuracy and F1-score of 0.91, higher than those reported in similar studies conducted on the same database.
https://journal.iberamia.org/index.php/intartif/article/view/573
Applications of AI
Deep Learning
Intelligent Video Processing
Violence Detection
collection DOAJ
language English
format Article
sources DOAJ
author Jean Phelipe de Oliveira Lima
Carlos Maurício Seródio Figueiredo
spellingShingle Jean Phelipe de Oliveira Lima
Carlos Maurício Seródio Figueiredo
A Temporal Fusion Approach for Video Classification with Convolutional and LSTM Neural Networks Applied to Violence Detection
Inteligencia Artificial
Applications of AI
Deep Learning
Intelligent Video Processing
Violence Detection
author_facet Jean Phelipe de Oliveira Lima
Carlos Maurício Seródio Figueiredo
author_sort Jean Phelipe de Oliveira Lima
title A Temporal Fusion Approach for Video Classification with Convolutional and LSTM Neural Networks Applied to Violence Detection
title_short A Temporal Fusion Approach for Video Classification with Convolutional and LSTM Neural Networks Applied to Violence Detection
title_full A Temporal Fusion Approach for Video Classification with Convolutional and LSTM Neural Networks Applied to Violence Detection
title_fullStr A Temporal Fusion Approach for Video Classification with Convolutional and LSTM Neural Networks Applied to Violence Detection
title_full_unstemmed A Temporal Fusion Approach for Video Classification with Convolutional and LSTM Neural Networks Applied to Violence Detection
title_sort temporal fusion approach for video classification with convolutional and lstm neural networks applied to violence detection
publisher Asociación Española para la Inteligencia Artificial
series Inteligencia Artificial
issn 1137-3601
1988-3064
publishDate 2021-02-01
description In modern smart cities, there is a quest for the highest level of integration and automation service. In the surveillance sector, one of the main challenges is to automate the analysis of videos in real-time to identify critical situations. This paper presents intelligent models based on Convolutional Neural Networks (for which the MobileNet, InceptionV3 and VGG16 networks were used), LSTM networks and feedforward networks for the task of classifying videos under the classes "Violence" and "Non-Violence", using the RLVS database. Different data representations were used according to the Temporal Fusion techniques. The best result was an accuracy and F1-score of 0.91, higher than those reported in similar studies conducted on the same database.
topic Applications of AI
Deep Learning
Intelligent Video Processing
Violence Detection
url https://journal.iberamia.org/index.php/intartif/article/view/573
work_keys_str_mv AT jeanphelipedeoliveiralima atemporalfusionapproachforvideoclassificationwithconvolutionalandlstmneuralnetworksappliedtoviolencedetection
AT carlosmauricioserodiofigueiredo atemporalfusionapproachforvideoclassificationwithconvolutionalandlstmneuralnetworksappliedtoviolencedetection
AT jeanphelipedeoliveiralima temporalfusionapproachforvideoclassificationwithconvolutionalandlstmneuralnetworksappliedtoviolencedetection
AT carlosmauricioserodiofigueiredo temporalfusionapproachforvideoclassificationwithconvolutionalandlstmneuralnetworksappliedtoviolencedetection
_version_ 1724229488627154944