Saliency Maps-Based Convolutional Neural Networks for Facial Expression Recognition
Facial expression recognition (FER) is an important research topic in affective computing and plays a key role in many applications of everyday life. The convolutional neural network (CNN), the most common method for extracting expression features, has one main limitation: because it lacks visual attention guidance, the expression information it extracts is contaminated by background noise, which lowers recognition accuracy. To simulate the attention mechanism of the human visual system, a salient feature extraction model is proposed, comprising a dilated inception module, a Difference of Gaussian (DOG) module, and a multi-indicator saliency prediction module. This model effectively captures key facial information by enlarging the receptive field, acquiring multiscale features, and simulating human vision. In addition, a novel FER method for a single person is proposed. Using saliency maps as prior knowledge together with the multilayer deep features of the CNN, recognition accuracy is improved by obtaining more targeted and more complete deep expression information. Experimental results for saliency prediction, action unit (AU) detection, and smile intensity estimation on the CAT2000, CK+, and BP4D databases show that the proposed method improves FER performance and is more effective than existing approaches.
Main Author: | Qinglan Wei (https://orcid.org/0000-0002-2710-0410), School of Data Science and Intelligent Media, Communication University of China, Beijing, China
---|---
Format: | Article
Language: | English
Published: | IEEE, 2021-01-01
Series: | IEEE Access, Vol. 9 (2021), pp. 76224-76234
DOI: | 10.1109/ACCESS.2021.3082694
ISSN: | 2169-3536
Subjects: | Facial expression recognition; saliency maps; dilated convolution; prior knowledge; convolutional neural network
Online Access: | https://ieeexplore.ieee.org/document/9438697/
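The record above is purely bibliographic, but the abstract outlines a concrete architecture: dilated convolutions to enlarge the receptive field, a Difference of Gaussian (DOG) module for multiscale contrast, and a saliency map used as a spatial prior to re-weight CNN features before classification. The sketch below is a minimal, hypothetical PyTorch illustration of those three ingredients, not the author's implementation; the module names (`DOGModule`, `DilatedInceptionBlock`, `SaliencyGuidedFER`), channel sizes, dilation rates, and Gaussian parameters are all assumptions chosen for clarity.

```python
# Illustrative sketch only (not the paper's released code): combines the three
# ingredients named in the abstract -- dilated convolutions, a Difference of
# Gaussian (DOG) branch, and saliency-weighted CNN features for FER.
import torch
import torch.nn as nn
import torch.nn.functional as F


def gaussian_kernel(size: int, sigma: float) -> torch.Tensor:
    """Normalized 2-D Gaussian kernel with shape (1, 1, size, size)."""
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    kernel = torch.outer(g, g)
    return (kernel / kernel.sum()).view(1, 1, size, size)


class DOGModule(nn.Module):
    """Difference of Gaussians: subtract two differently blurred copies of the input."""

    def __init__(self, size: int = 7, sigma_small: float = 1.0, sigma_large: float = 2.0):
        super().__init__()
        self.register_buffer("k_small", gaussian_kernel(size, sigma_small))
        self.register_buffer("k_large", gaussian_kernel(size, sigma_large))
        self.pad = size // 2

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (N, C, H, W)
        c = x.shape[1]
        blur_small = F.conv2d(x, self.k_small.repeat(c, 1, 1, 1), padding=self.pad, groups=c)
        blur_large = F.conv2d(x, self.k_large.repeat(c, 1, 1, 1), padding=self.pad, groups=c)
        return blur_small - blur_large  # band-pass response highlighting edges and blobs


class DilatedInceptionBlock(nn.Module):
    """Parallel 3x3 convolutions with different dilation rates, concatenated channel-wise."""

    def __init__(self, in_ch: int, out_ch_per_branch: int = 16, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch_per_branch, kernel_size=3, padding=r, dilation=r) for r in rates]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.cat([F.relu(branch(x)) for branch in self.branches], dim=1)


class SaliencyGuidedFER(nn.Module):
    """Toy FER head: CNN features are re-weighted by a predicted saliency map."""

    def __init__(self, num_classes: int = 7):
        super().__init__()
        self.dog = DOGModule()
        self.inception = DilatedInceptionBlock(in_ch=3)        # 3 input channels (RGB)
        self.saliency_head = nn.Conv2d(48, 1, kernel_size=1)   # 48 = 3 branches * 16 channels
        self.classifier = nn.Linear(48, num_classes)

    def forward(self, x: torch.Tensor):
        feats = self.inception(self.dog(x))                    # multiscale salient features
        saliency = torch.sigmoid(self.saliency_head(feats))    # (N, 1, H, W) saliency map
        weighted = feats * saliency                            # saliency map as a spatial prior
        pooled = F.adaptive_avg_pool2d(weighted, 1).flatten(1) # (N, 48)
        return self.classifier(pooled), saliency


if __name__ == "__main__":
    model = SaliencyGuidedFER()
    logits, saliency = model(torch.randn(2, 3, 128, 128))
    print(logits.shape, saliency.shape)  # torch.Size([2, 7]) torch.Size([2, 1, 128, 128])
```

Running the script prints the logits and saliency-map shapes for a batch of two 128x128 face crops. The actual method trains saliency prediction with multiple indicators and fuses multilayer deep features for recognition; this toy sketch only illustrates the overall flow.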