CONF lathoud04a/IDIAP
Unsupervised Location-Based Segmentation of Multi-Party Speech
Lathoud, Guillaume
McCowan, Iain A.
Odobez, Jean-Marc
EXTERNAL http://publications.idiap.ch/attachments/papers/2004/lathoud04a.pdf PUBLIC
http://publications.idiap.ch/index.php/publications/showcite/lathoud-rr-04-14 Related documents
Proceedings of the 2004 ICASSP-NIST Meeting Recognition Workshop
2004
Montreal, Canada
May 2004
IDIAP-RR 04-14
Accurate detection and segmentation of spontaneous multi-party speech is crucial for a variety of applications, including speech acquisition and recognition, as well as higher-level event recognition. However, the highly sporadic nature of spontaneous speech makes this task difficult. Moreover, multi-party speech contains many overlaps. We propose to attack this problem as a tracking task, using location cues only. In order to best deal with high sporadicity, we propose a novel, generic, short-term clustering algorithm that can track multiple objects at a low computational cost. The proposed approach is online, fully deterministic and can run in real time. In an application to real meeting data, the algorithm produces high-precision speech segmentation.

REPORT lathoud-rr-04-14/IDIAP
Short-Term Spatio-Temporal Clustering of Sporadic and Concurrent Events
Lathoud, Guillaume
McCowan, Iain A.
Odobez, Jean-Marc
EXTERNAL http://publications.idiap.ch/attachments/reports/2004/rr-04-14.pdf PUBLIC
Idiap-RR-14-2004
2004
IDIAP
Martigny, Switzerland
Published in "Proceedings of the 2004 ICASSP-NIST Meeting Recognition Workshop"
Accurate detection and segmentation of spontaneous multi-party speech is crucial for a variety of applications, including speech acquisition and recognition, as well as higher-level event recognition. However, the highly sporadic nature of spontaneous speech makes this task difficult. Moreover, multi-party speech contains many overlaps. We propose to attack this problem as a multitarget tracking task, using location cues only. In order to best deal with high sporadicity, we propose a novel, generic, short-term clustering algorithm that can track multiple objects at a low computational cost. The proposed approach is online, fully deterministic and can run in real time. In an application to real meeting data, the algorithm produces high-precision speech segmentation.

</datafield>

<subfield code="a">lathoud04a/IDIAP</subfield>

</datafield>

<subfield code="a">Unsupervised Location-Based Segmentation of Multi-Party Speech</subfield>

</datafield>

<subfield code="a">Lathoud, Guillaume</subfield>

</datafield>

<subfield code="a">McCowan, Iain A.</subfield>

</datafield>

<subfield code="a">Odobez, Jean-Marc</subfield>

</datafield>

<subfield code="i">EXTERNAL</subfield>

<subfield code="u">http://publications.idiap.ch/attachments/papers/2004/lathoud04a.pdf</subfield>

<subfield code="x">PUBLIC</subfield>

</datafield>

<subfield code="u">http://publications.idiap.ch/index.php/publications/showcite/lathoud-rr-04-14</subfield>

<subfield code="z">Related documents</subfield>

</datafield>

<subfield code="a">Proceedings of the 2004 ICASSP-NIST Meeting Recognition Workshop</subfield>

</datafield>

<subfield code="a">Montreal, Canada</subfield>

</datafield>

</datafield>

<subfield code="a">IDIAP-RR 04-14</subfield>

</datafield>

<subfield code="a">Accurate detection and segmentation of spontaneous multi-party speech is crucial for a variety of applications, including speech acquisition and recognition, as well as higher-level event recognition. However, the highly sporadic nature of spontaneous speech makes this task difficult. Moreover, multi-party speech contains many overlaps. We propose to attack this problem as a tracking task, using location cues only. In order to best deal with high sporadicity, we propose a novel, generic, short-term clustering algorithm that can track multiple objects at a low computational cost. The proposed approach is online, fully deterministic and can run in real time. In an application to real meeting data, the algorithm produces high-precision speech segmentation.</subfield>

</datafield>

</record>

<subfield code="a">REPORT</subfield>

</datafield>

<subfield code="a">lathoud-rr-04-14/IDIAP</subfield>

</datafield>

<subfield code="a">Short-Term Spatio-Temporal Clustering of Sporadic and Concurrent Events</subfield>

</datafield>

<subfield code="a">Lathoud, Guillaume</subfield>

</datafield>

<subfield code="a">McCowan, Iain A.</subfield>

</datafield>

<subfield code="a">Odobez, Jean-Marc</subfield>

</datafield>

<subfield code="i">EXTERNAL</subfield>

<subfield code="u">http://publications.idiap.ch/attachments/reports/2004/rr-04-14.pdf</subfield>

<subfield code="x">PUBLIC</subfield>

</datafield>

<subfield code="a">Idiap-RR-14-2004</subfield>

</datafield>

<subfield code="b">IDIAP</subfield>

<subfield code="a">Martigny, Switzerland</subfield>

</datafield>

<subfield code="a">Published in "Proceedings of the 2004 ICASSP-NIST Meeting Recognition Workshop"</subfield>

</datafield>

<subfield code="a">Accurate detection and segmentation of spontaneous multi-party speech is crucial for a variety of applications, including speech acquisition and recognition, as well as higher-level event recognition. However, the highly sporadic nature of spontaneous speech makes this task difficult. Moreover, multi-party speech contains many overlaps. We propose to attack this problem as a multitarget tracking task, using location cues only. In order to best deal with high sporadicity, we propose a novel, generic, short-term clustering algorithm that can track multiple objects at a low computational cost. The proposed approach is online, fully deterministic and can run in real time. In an application to real meeting data, the algorithm produces high-precision speech segmentation.</subfield>

</datafield>

</record>

</collection>