Evaluating the Robustness of Privacy-Sensitive Audio Features for Speech Detection in Personal Audio Log Scenarios

Type of publication:	Conference paper
Citation:	Parthasarathi_PROCEEDINGSOFICASSP2010_2010
Booktitle:	ICASSP 2010
Year:	2010
Abstract:	Personal audio logs are often recorded in multiple environments. This poses challenges for robust front-end processing, including speech/nonspeech detection (SND). Motivated by this, we investigate the robustness of four different privacy-sensitive features for SND, namely energy, zero crossing rate, spectral flatness, and kurtosis. We study early and late fusion of these features in conjunction with modeling temporal context. These combinations are evaluated in mismatched conditions on a dataset of nearly 450 hours. While both combinations yield improvements over individual features, generally feature combinations perform better. Comparisons with a state-of-the-art spectral-based feature set and a privacy-sensitive feature set are also provided.
Keywords:
Projects:	Idiap SNSF-MULTI
Authors:	Parthasarathi, Sree Hari Krishnan; Magimai-Doss, Mathew; Bourlard, Hervé; Gatica-Perez, Daniel
Attachments
Parthasarathi_PROCEEDINGSOFICASSP2010_2010.pdf