Exploiting Contextual Information for Speech/Non-Speech Detection
Type of publication: | Conference paper |
Citation: | Parthasarathi_TSD2008_2008 |
Booktitle: | Text, Speech and Dialogue |
Series: | Series of Lecture Notes In Artificial Intelligence (LNAI) |
Volume: | 5246 |
Year: | 2008 |
Month: | 9 |
Publisher: | Springer-Verlag Berlin, Heidelberg |
Location: | Brno, Czech Republic |
ISBN: | 978-3-540-87390-7 |
Crossref: | parthasarathi:rr08-22: |
Abstract: | In this paper, we investigate the effect of temporal context for speech/non-speech detection (SND). It is shown that even a simple feature such as full-band energy, when employed with a large-enough context, shows promise for further investigation. Experimental evaluations on the test data set, with a state-of-the-art multi-layer perceptron based SND system and a simple energy threshold based SND method, using the F-measure, show an absolute performance gain of 4.4% and 5.4% respectively. The optimal contextual length was found to be 1000 ms. Further numerical optimizations yield an improvement (3.37% absolute), resulting in an absolute gain of 7.77% and 8.77% over the MLP based and energy based methods respectively. ROC based performance evaluation also reveals promising performance for the proposed method, particularly in low SNR conditions. |
Keywords: | |
Projects |
Idiap |
Authors | |
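The abstract names three ingredients: a full-band energy feature, a temporal context of about 1000 ms, and evaluation with the F-measure. The following is a minimal sketch of how these could fit together, not the authors' implementation. The 25 ms frame length, the 10 ms frame shift, the use of a plain moving average to incorporate context, and all function names are assumptions made for illustration; the record does not specify any of them.

```python
import numpy as np


def frame_log_energy(signal, sample_rate, frame_ms=25.0, shift_ms=10.0):
    """Full-band log energy per frame (frame size and shift are assumed values)."""
    signal = np.asarray(signal, dtype=float)
    frame_len = int(sample_rate * frame_ms / 1000)
    shift = int(sample_rate * shift_ms / 1000)
    n_frames = 1 + max(0, (len(signal) - frame_len) // shift)
    energies = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * shift : i * shift + frame_len]
        # Small constant avoids log(0) on silent frames.
        energies[i] = np.log(np.sum(frame ** 2) + 1e-10)
    return energies


def add_context(energies, context_ms=1000.0, shift_ms=10.0):
    """Average each frame's energy over a centred context window.

    Averaging is an assumed, simple way to use context; the paper may
    combine the contextual frames differently.
    """
    width = int(round(context_ms / shift_ms)) | 1  # force an odd window width
    padded = np.pad(energies, width // 2, mode="edge")
    kernel = np.ones(width) / width
    return np.convolve(padded, kernel, mode="valid")


def detect_speech(contextual_energy, threshold):
    """Label a frame as speech when its contextual energy exceeds the threshold."""
    return contextual_energy > threshold


def f_measure(predicted, reference):
    """Harmonic mean of precision and recall on the speech class."""
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    true_pos = np.sum(predicted & reference)
    if true_pos == 0:
        return 0.0
    precision = true_pos / np.sum(predicted)
    recall = true_pos / np.sum(reference)
    return float(2 * precision * recall / (precision + recall))
```

In this sketch the threshold is left to the caller; it would normally be tuned on held-out data, and the context length can be varied through `context_ms` to compare settings.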