Robust and Discriminative Speaker Embedding via Intra-Class Distance Variance Regularization
| Type of publication: | Conference paper |
| Citation: | Le_INTERSPEECH2018_2018 |
| Publication status: | Published |
| Booktitle: | Proceedings of Interspeech |
| Year: | 2018 |
| Pages: | 2257-2261 |
| Location: | Hyderabad, India |
| ISSN: | 2308-457X |
| ISBN: | 978-1-5108-7221-9 |
| DOI: | 10.21437/Interspeech.2018-1685 |
| Abstract: | Learning a good speaker embedding is critical for many speech processing tasks, including recognition, verification, and diarization. To this end, we propose a complementary optimization goal called intra-class loss to improve deep speaker embeddings learned with triplet loss. This loss function is formulated as a soft constraint on the averaged pair-wise distance between samples from the same class. Its goal is to prevent the scattering of these samples within the embedding space and thereby increase intra-class compactness. When intra-class loss is jointly optimized with triplet loss, we observe two major improvements: the deep embedding network achieves a more robust and discriminative representation, and the training process is more stable with a faster convergence rate. We conduct experiments on two large public benchmarking datasets for speaker verification, VoxCeleb and VoxForge. The results show that intra-class loss helps accelerate the convergence of deep network training and significantly improves the overall performance of the resulting embeddings. |
| Keywords: | deep neural networks, embedding learning, speaker verification, triplet loss |
| Projects: | Idiap, EUMSSI, MUMMER |
| Authors: | |
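The abstract describes the intra-class term as a soft constraint on the averaged pair-wise distance between embeddings of the same speaker, optimized jointly with a triplet loss. Below is a minimal PyTorch sketch of one plausible reading of that formulation; the hinge margin, the loss weight `alpha`, and the `intra_class_loss` helper are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def intra_class_loss(embeddings, labels, margin=0.1):
    """Soft penalty on the averaged pairwise distance within each class.

    Assumed formulation: the abstract states a soft constraint on the
    averaged pair-wise same-class distance; the hinge/margin details here
    are illustrative.
    """
    loss = embeddings.new_zeros(())
    n_terms = 0
    for c in labels.unique():
        same = embeddings[labels == c]
        if same.size(0) < 2:
            continue
        # Average pairwise Euclidean distance among samples of class c.
        dists = torch.pdist(same, p=2)
        # Only penalize when the average distance exceeds the margin.
        loss = loss + F.relu(dists.mean() - margin)
        n_terms += 1
    return loss / max(n_terms, 1)


def joint_loss(anchor, positive, negative, batch_embeddings, batch_labels,
               triplet_margin=0.2, alpha=1.0):
    """Joint objective: triplet loss plus the intra-class regularizer.

    `alpha` weights the intra-class term; its value is an assumption.
    """
    triplet = F.triplet_margin_loss(anchor, positive, negative,
                                    margin=triplet_margin)
    intra = intra_class_loss(batch_embeddings, batch_labels)
    return triplet + alpha * intra
```

In such a setup, the same mini-batch embeddings used to form the triplets would typically feed both terms, so the regularizer adds little computational overhead.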