Abstract
Facial Action Unit (AU) detection is of major importance in a broad range of artificial intelligence applications such as healthcare, Facial Expression Recognition (FER), and mental state analysis. In this paper, we present an innovative, resource-efficient facial AU detection model that embeds both spatial and channel attention mechanisms into a convolutional neural network (CNN). Together with a distinctive data input scheme that pairs image data with binary-encoded AU activation labels, our model enhances AU detection capability while simultaneously offering interpretability for FER systems. In contrast to existing state-of-the-art models, its streamlined architecture, combined with superior performance, makes it well suited to resource-limited environments such as mobile and embedded systems with computational constraints. The model was trained and evaluated on the BP4D, CK+, DISFA, FER2013+, and RAF-DB datasets; the latter two are particularly significant as in-the-wild facial expression recognition datasets. These datasets provide ground-truth emotions matched with corresponding AU activations according to the Facial Action Coding System. Several metrics, including F1 score, accuracy, and Euclidean distance, demonstrate the model's effectiveness in AU detection and interpretability.
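The abstract does not give the model's exact layer configuration, so the following PyTorch sketch only illustrates the general idea it describes: a lightweight CNN whose feature maps pass through channel and spatial attention (shown here in a CBAM-like arrangement), ending in a multi-label head that predicts binary AU activations. All channel counts, kernel sizes, the number of AUs, and the class names are illustrative assumptions, not the paper's published architecture.

```python
# Minimal sketch: channel + spatial attention inside a small CNN for
# multi-label AU detection. Layer sizes are assumptions for illustration.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Pool feature maps over space, then re-weight each channel."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling branch
        scale = torch.sigmoid(avg + mx).unsqueeze(-1).unsqueeze(-1)
        return x * scale                     # per-channel gating


class SpatialAttention(nn.Module):
    """Pool over channels, then re-weight each spatial location."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pooled = torch.cat(
            [x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1
        )
        return x * torch.sigmoid(self.conv(pooled))  # per-location gating


class LightweightAUNet(nn.Module):
    """Small CNN backbone with attention; outputs one logit per AU."""

    def __init__(self, num_aus: int = 12):  # 12 AUs is an assumption (e.g. BP4D)
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            ChannelAttention(64),
            SpatialAttention(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, num_aus)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x).flatten(1))


# Targets are binary-encoded AU activation vectors, so each output unit
# gets its own sigmoid via binary cross-entropy (AUs can co-occur, which
# rules out a softmax over units).
model = LightweightAUNet()
images = torch.randn(4, 3, 112, 112)             # batch of face crops
targets = torch.randint(0, 2, (4, 12)).float()   # binary AU labels
loss = nn.BCEWithLogitsLoss()(model(images), targets)
```

At evaluation time, thresholding the sigmoid outputs yields a binary AU vector per image, which can be compared against the ground-truth encoding with F1 score, accuracy, or the Euclidean distance mentioned in the abstract.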
| Original language | English |
| --- | --- |
| Pages (from-to) | 117954-117970 |
| Number of pages | 17 |
| Journal | IEEE Access |
| Volume | 11 |
| DOIs | |
| Publication status | Published - 16 Oct 2023 |
Keywords
- facial action unit detection
- lightweight AU detection
- attention mechanism
- convolutional neural networks (CNN)
- explainable artificial intelligence (XAI)
- eXplainable FER system