Summary: | We propose novel multi-order statistical descriptors for high-speed object classification and face recognition from videos or image sets. We represent each gallery set with a global second-order statistic that captures correlated global variations in all feature directions as well as the common set structure. A lightweight descriptor is then constructed by efficiently compacting the second-order statistic using Cholesky decomposition. We enrich the descriptor with the first-order statistic of the gallery set to further enhance its representation power. Projecting the descriptor into a low-dimensional discriminant subspace yields further dimensionality reduction while preserving the discrimination power of the proposed representation. Our method therefore represents a complex image set by a single descriptor of significantly reduced dimensionality. We apply the proposed algorithm to image-set and video-based face and periocular biometric identification, object category recognition, and hand gesture recognition. Experiments on six benchmark data sets validate that the proposed method achieves significantly better classification accuracy with lower computational complexity than existing techniques. The proposed compact representations can be used for real-time object classification and face recognition in videos.
|
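The descriptor construction summarized above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the second-order statistic is taken to be the sample covariance of the set's feature vectors, the first-order statistic is taken to be the sample mean, a small ridge term `eps` is added so that the Cholesky factorization always exists, and the two parts are simply concatenated. The function name `set_descriptor` and the parameter `eps` are hypothetical. The projection into a discriminant subspace is not shown.

```python
import numpy as np


def set_descriptor(X, eps=1e-3):
    """Build one vector for an image set (illustrative sketch only).

    X   : array of shape (n_samples, d), one feature vector per image.
    eps : assumed ridge term that keeps the covariance positive definite.
    Returns a vector of length d + d * (d + 1) / 2.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape

    # First-order statistic: mean of the set (assumption).
    mu = X.mean(axis=0)

    # Second-order statistic: sample covariance of the set (assumption).
    Xc = X - mu
    C = Xc.T @ Xc / max(n - 1, 1)

    # Regularize so the Cholesky factor exists even for small or degenerate sets.
    C = C + eps * np.eye(d)

    # Cholesky factor C = L @ L.T; L is lower triangular, so only its
    # d * (d + 1) / 2 lower-triangular entries need to be stored.
    L = np.linalg.cholesky(C)
    tri = L[np.tril_indices(d)]

    # Concatenate first- and second-order parts (ordering is an assumption).
    return np.concatenate([mu, tri])
```

For a set of 50 images with 4-dimensional features, the resulting vector has 4 + 10 = 14 entries, compared with 16 entries for the full covariance matrix plus 4 for the mean. Keeping only the triangular factor is what makes the second-order part compact, and the covariance can be recovered from it as `L @ L.T`.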