An Integrated Framework for Multi-Channel Multi-Source Localization and Voice Activity Detection
| Type of publication: | Conference paper |
| Citation: | Taghizadeh_HSCMA_2011 |
| Publication status: | Published |
| Booktitle: | The Third Joint Workshop on Hands-free Speech Communication and Microphone Arrays |
| Year: | 2011 |
| Crossref: | Taghizadeh_Idiap-RR-16-2011 |
| Abstract: | Two of the major challenges in microphone-array-based adaptive beamforming, speech enhancement and distant speech recognition are robust and accurate source localization and voice activity detection. This paper introduces a spatial gradient steered response power using the phase transform (SRP-PHAT) method which is capable of localization of competing speakers in overlapping conditions. We further investigate the behaviour of the SRP function and characterize theoretically a fixed point in its search space for the diffuse noise field. We call this fixed point the null position in the SRP search space. Building on this evidence, we propose a technique for multi-channel voice activity detection (MVAD) based on detection of a maximum power corresponding to the null position. The gradient SRP-PHAT in tandem with the MVAD forms an integrated framework of multi-source localization and voice activity detection. The experiments carried out on real data recordings show that this framework is very effective in practical applications of hands-free communication. |
| Keywords: | |
| Projects: | Idiap IM2 |
| Authors: | |