Deep Neural Networks for Multiple Speaker Detection and Localization
Type of publication: | Conference paper |
Citation: | He_ICRA_2018 |
Publication status: | Published |
Booktitle: | 2018 IEEE International Conference on Robotics and Automation (ICRA) |
Year: | 2018 |
Month: | May |
Pages: | 74-79 |
Location: | Brisbane, Australia |
ISSN: | 1050-4729 |
ISBN: | 978-1-5386-3081-5 |
Crossref: | He_Idiap-RR-02-2018 |
DOI: | 10.1109/ICRA.2018.8461267 |
Abstract: | We propose to use neural networks for simultaneous detection and localization of multiple sound sources in human-robot interaction. In contrast to conventional signal processing techniques, neural network-based sound source localization methods require fewer strong assumptions about the environment. Previous neural network-based methods have focused on localizing a single sound source and do not extend to the detection and localization of multiple sources. In this paper, we thus propose a likelihood-based encoding of the network output, which naturally allows the detection of an arbitrary number of sources. In addition, we investigate the use of sub-band cross-correlation information as features for better localization in sound mixtures, as well as three different network architectures based on different motivations. Experiments on real data recorded from a robot show that our proposed methods significantly outperform the popular spatial spectrum-based approaches. |
Keywords: | acoustic generators, Artificial Neural Networks, deep neural networks, Delays, Encoding, Estimation, human-robot interaction, likelihood-based encoding, microphone arrays, Microphones, multiple sound sources, multiple speaker detection, network output, neural nets, neural network-based sound source localization methods, Robots, simultaneous detection, single sound source, sound mixtures, spatial spectrum-based approaches, speaker recognition |
Projects |
Idiap MUMMER |
Authors | |