CONF mccowan-rr-02-59b/IDIAP
Title: Modeling Human Interaction in Meetings
Authors: McCowan, Iain A.; Bengio, Samy; Gatica-Perez, Daniel; Lathoud, Guillaume; Monay, Florent; Moore, Darren; Wellner, Pierre; Bourlard, Hervé
PDF (PUBLIC): https://publications.idiap.ch/attachments/reports/2002/rr02-59.pdf
Related documents: https://publications.idiap.ch/index.php/publications/showcite/mccowan-rr-02-59
Published in: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP 2003), Hong Kong, April 2003
Report number: IDIAP-RR 02-59

Abstract:
This paper investigates the recognition of group actions in meetings by modeling the joint behaviour of participants. Many meeting actions, such as presentations, discussions and consensus, are characterised by similar or complementary behaviour across participants. Recognising these meaningful actions is an important step towards the goal of providing effective browsing and summarisation of processed meetings. In this work, a corpus of meetings was collected in a room equipped with a number of microphones and cameras. The corpus was labeled in terms of a predefined set of meeting actions characterised by global behaviour. In experiments, audio and visual features for each participant are extracted from the raw data and the interaction of participants is modeled using HMM-based approaches. Initial results on the corpus demonstrate the ability of the system to recognise the set of meeting actions.

REPORT mccowan-rr-02-59/IDIAP
Title: Modeling Human Interaction in Meetings
Authors: McCowan, Iain A.; Bengio, Samy; Gatica-Perez, Daniel; Lathoud, Guillaume; Monay, Florent; Moore, Darren; Wellner, Pierre; Bourlard, Hervé
PDF (PUBLIC): https://publications.idiap.ch/attachments/reports/2002/rr02-59.pdf
Report number: Idiap-RR-59-2002, IDIAP, Martigny, Switzerland, 2002
Note: Published in Proceedings of ICASSP
Abstract: same as the conference version above.
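
The recognition scheme described in the abstract (per-participant audio-visual features, meeting actions modeled with HMMs, recognition by picking the best-scoring model) can be illustrated with a minimal sketch. The sketch below is not the authors' implementation: the action labels, feature dimensions, number of HMM states, and the hmmlearn-based training and scoring code are all illustrative assumptions; the actual features and model variants are described in the report itself.

# A minimal sketch, assuming hmmlearn and synthetic data, of the HMM-based
# recognition described in the abstract: per-participant audio-visual feature
# vectors are concatenated into a single observation per frame, one Gaussian
# HMM is trained per meeting action, and a test segment is assigned to the
# action whose model gives the highest log-likelihood.
import numpy as np
from hmmlearn.hmm import GaussianHMM

# Hypothetical label set and dimensions -- not the values used in the paper.
ACTIONS = ["monologue", "discussion", "presentation", "consensus"]
N_PARTICIPANTS = 4
FEAT_DIM_PER_PERSON = 6                       # e.g. speech activity, pitch, energy, motion
OBS_DIM = N_PARTICIPANTS * FEAT_DIM_PER_PERSON

def train_action_models(segments_by_action, n_states=3):
    """Train one HMM per action from lists of (T, OBS_DIM) feature segments."""
    models = {}
    for action, segments in segments_by_action.items():
        X = np.vstack(segments)                   # all frames, stacked
        lengths = [len(seg) for seg in segments]  # frame count of each segment
        hmm = GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=20)
        hmm.fit(X, lengths)
        models[action] = hmm
    return models

def recognise(models, segment):
    """Return the action whose HMM assigns the segment the highest log-likelihood."""
    return max(models, key=lambda action: models[action].score(segment))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in features: shift the mean per action so the toy models differ.
    train = {action: [rng.normal(loc=i, size=(50, OBS_DIM)) for _ in range(3)]
             for i, action in enumerate(ACTIONS)}
    models = train_action_models(train)
    test_segment = rng.normal(loc=2.0, size=(40, OBS_DIM))
    print(recognise(models, test_segment))        # expected: "presentation" for this toy data

In this toy setup each action model sees observations drawn around a different mean, so whole-segment log-likelihood comparison suffices; with real meeting data, the discriminative power would instead come from the temporal structure of the per-participant features captured by the HMM states.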