CONF
lathoud04a/IDIAP
Unsupervised Location-Based Segmentation of Multi-Party Speech
Lathoud, Guillaume
McCowan, Iain A.
Odobez, Jean-Marc
EXTERNAL
https://publications.idiap.ch/attachments/papers/2004/lathoud04a.pdf
PUBLIC
https://publications.idiap.ch/index.php/publications/showcite/lathoud-rr-04-14
Related documents
Proceedings of the 2004 ICASSP-NIST Meeting Recognition Workshop
2004
Montreal, Canada
May 2004
IDIAP-RR 04-14
Accurate detection and segmentation of spontaneous multi-party speech is crucial for a variety of applications, including speech acquisition and recognition, as well as higher-level event recognition. However, the highly sporadic nature of spontaneous speech makes this task difficult. Moreover, multi-party speech contains many overlaps. We propose to attack this problem as a tracking task, using location cues only. To best deal with high sporadicity, we propose a novel, generic, short-term clustering algorithm that can track multiple objects at a low computational cost. The proposed approach is online, fully deterministic, and can run in real time. In an application to real meeting data, the algorithm produces high-precision speech segmentation.
REPORT
lathoud-rr-04-14/IDIAP
Short-Term Spatio-Temporal Clustering of Sporadic and Concurrent Events
Lathoud, Guillaume
McCowan, Iain A.
Odobez, Jean-Marc
EXTERNAL
https://publications.idiap.ch/attachments/reports/2004/rr-04-14.pdf
PUBLIC
Idiap-RR-14-2004
2004
IDIAP
Martigny, Switzerland
Published in "Proceedings of the 2004 ICASSP-NIST Meeting Recognition Workshop"
Accurate detection and segmentation of spontaneous multi-party speech is crucial for a variety of applications, including speech acquisition and recognition, as well as higher-level event recognition. However, the highly sporadic nature of spontaneous speech makes this task difficult. Moreover, multi-party speech contains many overlaps. We propose to attack this problem as a multitarget tracking task, using location cues only. To best deal with high sporadicity, we propose a novel, generic, short-term clustering algorithm that can track multiple objects at a low computational cost. The proposed approach is online, fully deterministic, and can run in real time. In an application to real meeting data, the algorithm produces high-precision speech segmentation.