An End-to-End Task-Simplified and Anchor-Guided Deep Learning Framework for Image-Based Head Pose Estimation

Bibliographic Details
Main Authors: Jing Li, Jiang Wang, Farhan Ullah
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/9019692/
Description
Summary: Image-based Head Pose Estimation (HPE) from an arbitrary view remains challenging because of complex imaging conditions and the intrinsic and extrinsic properties of faces. Unlike existing HPE methods that combine additional cues or tasks, this paper addresses the HPE problem by reducing problem complexity. Our method integrates the deep Task-Simplification oriented Image Regularization (TSIR) module with the Anchor-Guided Pose Estimation (AGPE) module and formulates the HPE problem as a unified end-to-end learning framework. In this paper, we define anchors as images that strictly obey the “gravity rule in camera”, which assumes that the vertical axis of the camera coordinate system should always be consistent with that of the local head coordinate system. We form an image pair from the regularized image produced by TSIR and its anchor counterpart, both of which are fed into the AGPE module to estimate fine-grained head poses. This paper also proposes an Anchor-Guided Pairwise Loss (AGPL), which describes the interdependent relevance of poses between each pair of images. Comprehensive experiments validate the effectiveness of the proposed method and show that it outperforms state-of-the-art image-based methods on both indoor and outdoor datasets.
ISSN: 2169-3536