Robust and Discriminative Speaker Embedding via Intra-Class Distance Variance Regularization

Type of publication:	Conference paper
Citation:	Le_INTERSPEECH2018_2018
Publication status:	Published
Booktitle:	Proceedings of Interspeech
Year:	2018
Pages:	2257-2261
Location:	Hyderabad, India
ISSN:	2308-457X
ISBN:	978-1-5108-7221-9
DOI:	10.21437/Interspeech.2018-1685
Abstract:	Learning a good speaker embedding is critical for many speech processing tasks, including recognition, verification, and diarization. To this end, we propose a complementary optimization objective called intra-class loss to improve deep speaker embeddings learned with triplet loss. This loss function is formulated as a soft constraint on the averaged pair-wise distance between samples from the same class. Its goal is to prevent these samples from scattering within the embedding space and thereby to increase intra-class compactness. When intra-class loss is jointly optimized with triplet loss, we observe two major improvements: the deep embedding network achieves a more robust and discriminative representation, and the training process is more stable, with a faster convergence rate. We conduct experiments on two large public benchmark datasets for speaker verification, VoxCeleb and VoxForge. The results show that intra-class loss helps accelerate the convergence of deep network training and significantly improves the overall performance of the resulting embeddings.
Keywords:	deep neural networks, embedding learning, speaker verification, triplet loss
Projects:	Idiap, EUMSSI, MUMMER
Authors:	Le, Nam; Odobez, Jean-Marc
Attachments
Le_INTERSPEECH2018_2018.pdf
Notes
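
The abstract describes the intra-class loss only in words: a soft constraint on the averaged pair-wise distance between samples of the same class, optimized jointly with triplet loss. The following is a minimal illustrative sketch of that idea, not the formulation from the paper. The hinge form, the Euclidean distance, and the names `threshold`, `margin`, and `weight` are assumptions introduced here for illustration; the exact definition is in the attached PDF.

```python
import math
from collections import defaultdict
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two embedding vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard hinge-style triplet loss for one (anchor, positive, negative) triplet."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def intra_class_loss(embeddings, labels, threshold=0.1):
    """Assumed soft constraint: penalize a class only when the mean pair-wise
    distance between its samples exceeds `threshold` (hypothetical parameter)."""
    by_class = defaultdict(list)
    for emb, label in zip(embeddings, labels):
        by_class[label].append(emb)

    penalties = []
    for members in by_class.values():
        pairs = list(combinations(members, 2))
        if not pairs:  # a class with a single sample has no pair-wise distance
            continue
        mean_dist = sum(euclidean(u, v) for u, v in pairs) / len(pairs)
        penalties.append(max(0.0, mean_dist - threshold))

    # Average the per-class penalties; 0.0 if no class has at least two samples.
    return sum(penalties) / len(penalties) if penalties else 0.0


def joint_loss(triplets, embeddings, labels, weight=1.0):
    """Mean triplet loss plus a weighted intra-class term (`weight` is hypothetical)."""
    mean_triplet = sum(triplet_loss(a, p, n) for a, p, n in triplets) / len(triplets)
    return mean_triplet + weight * intra_class_loss(embeddings, labels)
```

In this sketch the intra-class term is zero for any class whose samples are already closer on average than `threshold`, which is one way to read "soft constraint": compact classes are left alone, and only scattered ones contribute to the loss.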