Joint Detection and Localization of Multiple Speakers using a Probabilistic Interpretation of the Steered Response Power
Type of publication: Conference paper
Citation: Oualil_SAPA-SCALE_2012
Publication status: Accepted
Booktitle: Statistical and Perceptual Audition Workshop
Year: 2012
Month: September
Abstract: Detection and localization of multiple speakers in a noisy and reverberant environment is a fundamental and difficult task. In the literature, steered response power (SRP) based techniques are typically used to accomplish this task, but they can be computationally intensive, and the localization of multiple speakers remains challenging in practice. In this paper, we present a novel approach based on a probabilistic interpretation of the SRP. The proposed method replaces the discrete search techniques with an approximate analytical form of the SRP, which can adequately detect and localize multiple speakers. In addition to reliable detection and localization, a potential advantage of this approach is that it provides a probability density function (pdf) of the individual speaker positions rather than point estimates. Experiments on the AV16.3 corpus show the efficacy of the proposed approach.
Keywords: Gaussian mixture, Multiple speaker localization, Steered response power
