Idiap Research Institute
MTGS: A Novel Framework for Multi-Person Temporal Gaze Following and Social Gaze Prediction
Type of publication: Conference paper
Citation: Gupta_NEURIPS_2024
Publication status: Accepted
Booktitle: 38th Conference on Neural Information Processing Systems
Year: 2024
Month: December
Abstract: Gaze following and social gaze prediction are fundamental tasks providing insights into human communication behaviors, intent, and social interactions. Most previous approaches addressed these tasks separately, either by designing highly specialized social gaze models that do not generalize to other social gaze tasks or by considering social gaze inference as an ad-hoc post-processing of the gaze following task. Furthermore, the vast majority of gaze following approaches have proposed models that can handle only one person at a time and are static, therefore failing to take advantage of social interactions and temporal dynamics. In this paper, we address these limitations and introduce a novel framework to jointly predict the gaze target and social gaze label for all people in the scene. It comprises (i) a temporal, transformer-based architecture that, in addition to frame tokens, handles person-specific tokens capturing the gaze information related to each individual; (ii) a new dataset, VSGaze, built from multiple gaze following and social gaze datasets by extending and validating head detections and tracks, and unifying annotation types. We demonstrate that our model can address and benefit from training on all tasks jointly, achieving state-of-the-art results for multi-person gaze following and social gaze prediction. Our annotations and code will be made publicly available.
Keywords:
Projects Idiap
AI4Autism
Authors Gupta, Anshul
Tafasca, Samy
Farkhondeh, Arya
Vuillecard, Pierre
Odobez, Jean-Marc
Attachments
  • Gupta_NEURIPS_2024.pdf
Notes