Simultaneous person attribute recognition using task-specific attention network on embedded devices

  • George Azzopardi
  • Antonio Greco
  • Alessia Saggese
  • Bruno Vento*
  • *Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Pedestrian attribute recognition has become an important task in computer vision, particularly for retail and marketing and for critical applications in security and surveillance. Despite its potential, achieving real-time performance on embedded devices while maintaining accuracy has been a significant challenge. In this paper, we propose a novel multi-task method for simultaneous person attribute recognition using a task-specific attention network, which shares low-level representations across related tasks, reducing computational and memory requirements without compromising accuracy. In particular, we employ a spatial-channel attention mechanism to selectively focus on relevant regions without increasing the computational complexity of the backbone. Furthermore, we use knowledge distillation to handle missing labels and gradient normalization to handle task imbalance, since varying task difficulties lead to disproportionate gradient magnitudes during training. The experimental results demonstrate the effectiveness of our approach, which achieves a mean accuracy of 0.889 while maintaining real-time performance at 114 frames per second on an embedded board with limited resources. These results highlight the practical viability and novelty of our system as a robust and scalable solution for pedestrian attribute recognition on embedded devices in real-world scenarios.
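
The abstract mentions knowledge distillation as a way to cope with missing labels in a multi-task setting. The following is a minimal sketch of one common way to do this, not the authors' implementation: for each task, the student is trained against the ground-truth label when it exists and against a teacher's soft prediction when it does not. The task names ("gender", "bag"), the function names, and the plain averaging over tasks are all assumptions made for illustration.

```python
import math


def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy H(target, predicted) between two discrete distributions."""
    return -sum(
        t * math.log(max(p, eps)) for t, p in zip(target_probs, predicted_probs)
    )


def one_hot(index, num_classes):
    """Turn a class index into a one-hot probability vector."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def multitask_loss(student_probs, labels, teacher_probs):
    """Average per-task loss for a single sample.

    student_probs: dict mapping task name -> student class probabilities
    labels:        dict mapping task name -> class index, or None if missing
    teacher_probs: dict mapping task name -> teacher class probabilities,
                   consulted only for tasks whose label is missing
    """
    total = 0.0
    for task, probs in student_probs.items():
        label = labels.get(task)
        if label is not None:
            # Ground truth available: ordinary supervised target.
            target = one_hot(label, len(probs))
        else:
            # Label missing: fall back to the teacher's soft prediction.
            target = teacher_probs[task]
        total += cross_entropy(target, probs)
    return total / len(student_probs)


# Hypothetical sample: "gender" is annotated, "bag" is not.
student = {"gender": [0.5, 0.5], "bag": [0.5, 0.5]}
labels = {"gender": 0, "bag": None}
teacher = {"bag": [0.5, 0.5]}
loss = multitask_loss(student, labels, teacher)
```

In this sketch every task contributes equally to the total. The gradient normalization described in the abstract would instead rebalance the per-task contributions during training; that step is not shown here.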
Original language: English
Article number: 113508
Number of pages: 14
Journal: Engineering Applications of Artificial Intelligence
Volume: 165
Issue number: B
Early online date: 12-Dec-2025
Publication status: Published - 1-Feb-2026