CONF
Korchagin_MMM_2012/IDIAP
Multimodal Cue Detection Engine for Orchestrated Entertainment
Korchagin, Danil
Duffner, Stefan
Motlicek, Petr
Scheffler, Carl
data analysis
multimodal signal processing
sensor fusion
EXTERNAL
https://publications.idiap.ch/attachments/papers/2011/Korchagin_MMM_2012.pdf
PUBLIC
https://publications.idiap.ch/index.php/publications/showcite/Korchagin_Idiap-RR-34-2011
Related documents
Proceedings International Conference on MultiMedia Modeling
Klagenfurt, Austria
2012
In this paper, we describe a low-delay real-time multimodal cue detection engine for a living room environment. The system is designed to be used in open, unconstrained environments to allow multiple people to enter, interact and leave the observable world with no constraints. It comprises detection and tracking of up to 4 faces, estimation of head poses and visual focus of attention, detection and localisation of verbal and paralinguistic events, their association and fusion. The system is designed as a flexible component to be used in conjunction with an orchestrated video conferencing system to improve the overall experience of interaction between spatially separated families and friends. Reduced latency levels achieved to date have shown improved responsiveness of the system.
REPORT
Korchagin_Idiap-RR-34-2011/IDIAP
Multimodal Cue Detection Engine for Orchestrated Entertainment
Korchagin, Danil
Duffner, Stefan
Motlicek, Petr
Scheffler, Carl
data analysis
multimodal signal processing
sensor fusion
EXTERNAL
https://publications.idiap.ch/attachments/reports/2011/Korchagin_Idiap-RR-34-2011.pdf
PUBLIC
Idiap-RR-34-2011
2011
Idiap
Martigny, Switzerland
October 2011
In this paper, we describe a low-delay real-time multimodal cue detection engine for a living room environment. The system is designed to be used in open, unconstrained environments to allow multiple people to enter, interact and leave the observable world with no constraints. It comprises detection and tracking of up to 4 faces, estimation of head poses and visual focus of attention, detection and localisation of verbal and paralinguistic events, their association and fusion. The system is designed as a flexible component to be used in conjunction with an orchestrated video conferencing system to improve the overall experience of interaction between spatially separated families and friends. Reduced latency levels achieved to date have shown improved responsiveness of the system.