ARTICLE mccowan-rr-03-27b/IDIAP
Automatic Analysis of Multimodal Group Actions in Meetings
McCowan, Iain A.; Gatica-Perez, Daniel; Bengio, Samy; Lathoud, Guillaume; Barnard, Mark; Zhang, Dong
EXTERNAL https://publications.idiap.ch/attachments/reports/2003/mccowan-03-27.pdf PUBLIC
Related documents: https://publications.idiap.ch/index.php/publications/showcite/mccowan-rr-03-27
IEEE Transactions on Pattern Analysis and Machine Intelligence (to appear), 2004

This paper investigates the recognition of group actions in meetings. A statistical framework is proposed in which group actions result from the interactions of the individual participants. The group actions are modelled using different HMM-based approaches, where the observations are provided by a set of audio-visual features monitoring the actions of individuals. Experiments demonstrate the importance of taking interactions into account in modelling the group actions. It is also shown that the visual modality contains useful information, even for predominantly audio-based events, motivating a multimodal approach to meeting analysis.

REPORT mccowan-rr-03-27/IDIAP
Automatic Analysis of Multimodal Group Actions in Meetings
McCowan, Iain A.; Gatica-Perez, Daniel; Bengio, Samy; Lathoud, Guillaume; Barnard, Mark; Zhang, Dong
EXTERNAL https://publications.idiap.ch/attachments/reports/2003/mccowan-03-27.pdf PUBLIC
Idiap-RR-27-2003, 2003
IDIAP, Martigny, Switzerland
To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence