A benchmark of dynamic versus static methods for facial action unit detection
Abstract: Action units (AUs) are activations of individual facial muscle regions that unfold over time to constitute a natural facial expression event, so AU occurrence can be inferred from the temporally consecutive movements of these regions. Detecting AUs automatically offers clear benefits because it exploits both static and dynamic facial features. Our work makes three contributions. First, we extracted features using Local Binary Patterns (LBP), Local Phase Quantisation (LPQ), and the dynamic texture descriptor LPQ-TOP, together with two networks leveraged from different CNN architectures, for local deep visual learning in AU image analysis. Second, we cascaded the LPQ-TOP feature vector with a Long Short-Term Memory (LSTM) network to encode longer-term temporal information, and we found that stacking an LSTM on top of a CNN learns spatial and temporal structure simultaneously; we also hypothesised that unsupervised Slow Feature Analysis can extract invariant information from dynamic textures. Third, we compared continuous scoring predictions among LPQ-TOP with SVM, LPQ-TOP with LSTM, and AlexNet. A substantial comparative evaluation was carried out on the Enhanced CK dataset. Overall, the results indicate that the CNN is very promising and surpassed all other methods.
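To illustrate the static texture features the abstract names, here is a minimal NumPy sketch of a basic Local Binary Pattern (8 neighbours, radius 1) and its histogram feature vector. This is an assumption-level illustration with hypothetical helper names (`lbp_8_1`, `lbp_histogram`), not the authors' implementation, which may use a different LBP variant, radius, or cell layout:

```python
import numpy as np

def lbp_8_1(img):
    """Basic 8-neighbour Local Binary Pattern (radius 1) on a 2-D grayscale image.

    Each interior pixel is replaced by an 8-bit code: one bit per neighbour,
    set when that neighbour is >= the centre pixel.
    """
    img = np.asarray(img, dtype=np.int32)
    c = img[1:-1, 1:-1]  # centre pixels (borders are skipped)
    # clockwise neighbour offsets starting at the top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offsets):
        nb = img[1 + dy: img.shape[0] - 1 + dy,
                 1 + dx: img.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.int32) << bit
    return code

def lbp_histogram(img, bins=256):
    """L1-normalised histogram of LBP codes -- the per-region texture feature."""
    codes = lbp_8_1(img)
    hist, _ = np.histogram(codes, bins=bins, range=(0, bins))
    return hist / max(hist.sum(), 1)
```

In practice such histograms are computed per facial region and concatenated; the dynamic LPQ-TOP descriptor extends the same idea by pooling codes over three orthogonal planes (XY, XT, YT) of the image sequence.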
Main Authors: L. Alharbawee, N. Pugeault
Format: Article
Language: English
Published: Wiley, 2021-05-01
Series: The Journal of Engineering
Online Access: https://doi.org/10.1049/tje2.12001
ISSN: 2051-3305
Citation: The Journal of Engineering, 2021(5), pp. 252-266
DOI: 10.1049/tje2.12001
Affiliation: College of Engineering Mathematics and Physical Sciences, University of Exeter, Exeter, UK (both authors)
Indexed in: DOAJ (record doaj-e3095058f95743baaab8b1bcc5f13afb)