Modeling Source and System characteristics using Zero Frequency Filtering for Voice Activity Detection
Type of publication: | Idiap-Internal-RR |
Citation: | Sarkar_Idiap-Internal-RR-80-2021 |
Number: | Idiap-Internal-RR-80-2021 |
Year: | 2021 |
Month: | October |
Institution: | Idiap |
Address: | Rue Marconi 19, 1920 Martigny, CH |
Note: | Submitted to ICASSP 2022 |
Abstract: | Voice activity detection (VAD) is an important pre-processing step for several speech applications. The task requires demarcating the boundaries of segments that carry voicing information. Several methods in the literature perform VAD using spectral and temporal information extracted across overlapping segments of speech. Each approach, however, has limitations owing to its underlying decision criterion and thresholding. The present paper proposes a time-domain, signal-processing-based approach that derives knowledge-based, speech-specific characteristics for VAD. Specifically, we show that extracting source and system characteristics in a framework based on the zero-frequency filtering (ZFF) method helps in robustly identifying the boundaries of segments with voicing information in a given audio signal. The proposed method is compared with other signal-processing-based methods that highlight speech-specific spectral information. The methods are evaluated on speech segments obtained from the TIMIT and MUSAN databases across a range of SNRs (40–0 dB). The analysis shows that the proposed ZFF-based method is robust to degradation and exhibits a clear advantage over the other spectral-information-based methods. |
Keywords: | |
Projects: | Idiap |
Authors | |