CONF
Motlicek_INTERSPEECH2010_2010/IDIAP
English Spoken Term Detection in Multilingual Recordings
Motlicek, Petr
Valente, Fabio
Garner, Philip N.
Confidence Measure (CM)
LVCSR
Out-Of-Language (OOL) detection
Spoken Term Detection (STD)
https://publications.idiap.ch/index.php/publications/showcite/Motlicek_Idiap-RR-21-2010
Related documents
ISCA - Proceedings of Interspeech, Makuhari, Japan, 2010
Makuhari, Japan
2010
September 2010
This paper investigates the automatic detection of English spoken terms in a multi-language scenario over real lecture recordings. Spoken Term Detection (STD) is based on an LVCSR where the output is represented in the form of word lattices. The lattices are then used to search the required terms. Processed lectures are mainly composed of English, French and Italian recordings where the language can also change within one recording. Therefore, the English STD system uses an Out-Of-Language (OOL) detection module to filter out non-English input segments. OOL detection is evaluated w.r.t. various confidence measures estimated from word lattices. Experimental studies of OOL detection followed by English STD are performed on several hours of multilingual recordings. Significant improvement of OOL+STD over a stand-alone STD system is achieved (relatively more than 50% in EER). Finally, an additional modality (text slides in the form of PowerPoint presentations) is exploited to improve STD.
REPORT
Motlicek_Idiap-RR-21-2010/IDIAP
English Spoken Term Detection in Multilingual Recordings
Motlicek, Petr
Valente, Fabio
Garner, Philip N.
https://publications.idiap.ch/index.php/publications/showcite/Motlicek_INTERSPEECH2010_2010
Related documents
Idiap-RR-21-2010
2010
Idiap
Rue Marconi 19, Martigny, 1920, Switzerland
July 2010
This paper investigates the automatic detection of English spoken terms in a multi-language scenario over real lecture recordings. Spoken Term Detection (STD) is based on an LVCSR where the output is represented in the form of word lattices. The lattices are then used to search the required terms. Processed lectures are mainly composed of English, French and Italian recordings where the language can also change within one recording. Therefore, the English STD system uses an Out-Of-Language (OOL) detection module to filter out non-English input segments. OOL detection is evaluated w.r.t. various confidence measures estimated from word lattices. Experimental studies of OOL detection followed by English STD are performed on several hours of multilingual recordings. Significant improvement of OOL+STD over a stand-alone STD system is achieved (relatively more than 50% in EER). Finally, an additional modality (text slides in the form of PowerPoint presentations) is exploited to improve STD.