pyannote.audio: neural building blocks for speaker diarization

Type of publication:	Conference paper
Citation:	Bredin_ICASSP_2020
Publication status:	Published
Booktitle:	IEEE International Conference on Acoustics, Speech, and Signal Processing
Year:	2020
Month:	May
URL:	https://arxiv.org/pdf/1911.012...
Abstract:	We introduce pyannote.audio, an open-source toolkit written in Python for speaker diarization. Based on the PyTorch machine learning framework, it provides a set of trainable end-to-end neural building blocks that can be combined and jointly optimized to build speaker diarization pipelines. pyannote.audio also comes with pre-trained models covering a wide range of domains for voice activity detection, speaker change detection, overlapped speech detection, and speaker embedding, reaching state-of-the-art performance for most of them.
Keywords:
Projects	Idiap SWAN
Authors	Bredin, Herve; Yin, Ruiqing; Coria, Juan Manuel; Korshunov, Pavel; Lavechin, Marvin; Fustes, Diego; Titeux, Hadrien; Bouaziz, Wassim; Gill, Marie-Philippe