Idiap Research Institute
AN INTEGRATED FRAMEWORK FOR MULTI-CHANNEL MULTI-SOURCE LOCALIZATION AND VOICE ACTIVITY DETECTION
Type of publication: Idiap-RR
Citation: Taghizadeh_Idiap-RR-16-2011
Number: Idiap-RR-16-2011
Year: 2011
Month: 6
Institution: Idiap
Abstract: Two of the major challenges in microphone array based adaptive beamforming, speech enhancement and distant speech recognition, are robust and accurate source localization and voice activity detection. This paper introduces a spatial gradient steered response power using the phase transform (SRP-PHAT) method which is capable of localization of competing speakers in overlapping conditions. We further investigate the behavior of the SRP function and characterize theoretically a fixed point in its search space for the diffuse noise field. We call this fixed point the null position in the SRP search space. Building on this evidence, we propose a technique for multi-channel voice activity detection (MVAD) based on detection of a maximum power corresponding to the null position. The gradient SRP-PHAT in tandem with the MVAD form an integrated framework of multi-source localization and voice activity detection. The experiments carried out on real data recordings show that this framework is very effective in practical applications of hands-free communication.
Keywords:
Projects Idiap
IM2
Authors Taghizadeh, Mohammad J.
Garner, Philip N.
Bourlard, Hervé
Abutalebi, Hamid Reza
Asaei, Afsaneh
Crossref by Taghizadeh_HSCMA_2011
Added by: [ADM]
Attachments
  • Taghizadeh_Idiap-RR-16-2011.pdf (MD5: 2b6226260325cb3438c8767d814dc241)
Notes