Facial Expression Recognition Based on Weighted-Cluster Loss and Deep Transfer Learning Using a Highly Imbalanced Dataset

Facial expression recognition (FER) is a challenging problem in the fields of pattern recognition and computer vision. The recent success of convolutional neural networks (CNNs) in object detection and object segmentation tasks has shown promise in building an automatic deep CNN-based FER model. How...

Bibliographic Details
Main Authors: Quan T. Ngo, Seokhoon Yoon
Format: Article
Language: English
Published: MDPI AG 2020-05-01
Series: Sensors
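Subjects: facial expression recognition; deep convolutional neural network; transfer learning; auxiliary loss; weighted loss; class center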
Online Access: https://www.mdpi.com/1424-8220/20/9/2639
id doaj-d7c594112460432fbcf22809272cf85b
record_format Article
spelling doaj-d7c594112460432fbcf22809272cf85b
2020-11-25T03:29:39Z
eng
MDPI AG
Sensors
1424-8220
2020-05-01
20
2639
2639
10.3390/s20092639
Facial Expression Recognition Based on Weighted-Cluster Loss and Deep Transfer Learning Using a Highly Imbalanced Dataset
Quan T. Ngo (Department of Electrical and Computer Engineering, University of Ulsan, Ulsan 44610, Korea)
Seokhoon Yoon (Department of Electrical and Computer Engineering, University of Ulsan, Ulsan 44610, Korea)
Facial expression recognition (FER) is a challenging problem in the fields of pattern recognition and computer vision. The recent success of convolutional neural networks (CNNs) in object detection and object segmentation tasks has shown promise in building an automatic deep CNN-based FER model. However, in real-world scenarios, performance degrades dramatically owing to the great diversity of factors unrelated to facial expressions, and due to a lack of training data and an intrinsic imbalance in the existing facial emotion datasets. To tackle these problems, this paper not only applies deep transfer learning techniques, but also proposes a novel loss function called weighted-cluster loss, which is used during the fine-tuning phase. Specifically, the weighted-cluster loss function simultaneously improves the intra-class compactness and the inter-class separability by learning a class center for each emotion class. It also takes the imbalance in a facial expression dataset into account by giving each emotion class a weight based on its proportion of the total number of images. In addition, a recent, successful deep CNN architecture, pre-trained in the task of face identification with the VGGFace2 database from the Visual Geometry Group at Oxford University, is employed and fine-tuned using the proposed loss function to recognize eight basic facial emotions from the AffectNet database of facial expression, valence, and arousal computing in the wild. Experiments on an AffectNet real-world facial dataset demonstrate that our method outperforms the baseline CNN models that use either weighted-softmax loss or center loss.
https://www.mdpi.com/1424-8220/20/9/2639
facial expression recognition
deep convolutional neural network
transfer learning
auxiliary loss
weighted loss
class center
collection DOAJ
language English
format Article
sources DOAJ
author Quan T. Ngo
Seokhoon Yoon
title Facial Expression Recognition Based on Weighted-Cluster Loss and Deep Transfer Learning Using a Highly Imbalanced Dataset
title_sort facial expression recognition based on weighted-cluster loss and deep transfer learning using a highly imbalanced dataset
publisher MDPI AG
series Sensors
issn 1424-8220
publishDate 2020-05-01
description Facial expression recognition (FER) is a challenging problem in the fields of pattern recognition and computer vision. The recent success of convolutional neural networks (CNNs) in object detection and object segmentation tasks has shown promise in building an automatic deep CNN-based FER model. However, in real-world scenarios, performance degrades dramatically owing to the great diversity of factors unrelated to facial expressions, and due to a lack of training data and an intrinsic imbalance in the existing facial emotion datasets. To tackle these problems, this paper not only applies deep transfer learning techniques, but also proposes a novel loss function called weighted-cluster loss, which is used during the fine-tuning phase. Specifically, the weighted-cluster loss function simultaneously improves the intra-class compactness and the inter-class separability by learning a class center for each emotion class. It also takes the imbalance in a facial expression dataset into account by giving each emotion class a weight based on its proportion of the total number of images. In addition, a recent, successful deep CNN architecture, pre-trained in the task of face identification with the VGGFace2 database from the Visual Geometry Group at Oxford University, is employed and fine-tuned using the proposed loss function to recognize eight basic facial emotions from the AffectNet database of facial expression, valence, and arousal computing in the wild. Experiments on an AffectNet real-world facial dataset demonstrate that our method outperforms the baseline CNN models that use either weighted-softmax loss or center loss.
topic facial expression recognition
deep convolutional neural network
transfer learning
auxiliary loss
weighted loss
class center
url https://www.mdpi.com/1424-8220/20/9/2639