Idiap Research Institute
Towards directly modeling raw speech signal for speaker verification using CNNs
Type of publication: Conference paper
Citation: Muckenhirn_ICASSP_2018
Publication status: Published
Booktitle: IEEE International Conference on Acoustics, Speech and Signal Processing
Year: 2018
Month: April
Pages: 4884-4888
Location: Calgary, CANADA
ISBN: 978-1-5386-4658-8
Crossref: Muckenhirn_Idiap-RR-30-2017
Abstract: Speaker verification systems traditionally extract and model cepstral features or filter bank energies from the speech signal. In this paper, inspired by the success of neural network-based approaches that directly model the raw speech signal for applications such as speech recognition, emotion recognition and anti-spoofing, we propose a speaker verification approach in which speaker-discriminative information is learned directly from the speech signal by: (a) first training a CNN-based speaker identification system that takes the raw speech signal as input and learns to classify speakers (unknown to the speaker verification system); and then (b) building a speaker detector for each speaker in the speaker verification system by replacing the output layer of the speaker identification system with two outputs (genuine, impostor), and adapting the system in a discriminative manner with enrollment speech of the speaker and impostor speech data. Our investigations on the Voxforge database show that this approach can yield systems competitive with state-of-the-art systems. An analysis of the filters in the first convolution layer shows that the filters emphasize information in low-frequency regions (below 1000 Hz) and implicitly learn to model fundamental frequency information in the speech signal for speaker discrimination.
Keywords: Convolutional neural network, End-to-end learning, Fundamental frequency, recognition, speaker verification
Projects Idiap
UNITS
Authors Muckenhirn, Hannah
Magimai.-Doss, Mathew
Marcel, Sébastien
Attachments
  • Muckenhirn_ICASSP_2018.pdf
Notes
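The two-step procedure summarized in the abstract (train a speaker identification network on raw speech, then replace its output layer with two outputs per enrolled speaker) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the class `RawSpeechNet`, the helper `make_speaker_detector`, the layer sizes (4 filters, kernel length 30, stride 10, 50 identification speakers) and the random weights are all hypothetical, and both training and the discriminative adaptation step are omitted.

```python
import copy
import random


def conv1d(signal, kernel, stride):
    """Valid 1-D cross-correlation of raw samples with one filter."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(0, len(signal) - k + 1, stride)]


def new_output_layer(n_in, n_out, rng):
    """Randomly initialised linear layer (hypothetical initialisation)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


class RawSpeechNet:
    """Toy stand-in for a CNN that takes the raw speech signal as input."""

    def __init__(self, n_filters, kernel_len, n_outputs, seed=0):
        rng = random.Random(seed)
        # First convolution layer: filters applied directly to raw samples.
        self.kernels = [[rng.uniform(-0.1, 0.1) for _ in range(kernel_len)]
                        for _ in range(n_filters)]
        # Output layer: one score per class.
        self.output = new_output_layer(n_filters, n_outputs, rng)

    def features(self, signal):
        # ReLU followed by max-pooling over time, one value per filter.
        return [max(max(0.0, v) for v in conv1d(signal, k, stride=10))
                for k in self.kernels]

    def forward(self, signal):
        feats = self.features(signal)
        return [sum(w * f for w, f in zip(row, feats)) for row in self.output]


def make_speaker_detector(id_net, seed=1):
    """Step (b): keep the learned filters, swap in a 2-output layer.

    The two outputs stand for (genuine, impostor). Adapting the copy on
    enrollment and impostor speech is not shown here.
    """
    detector = copy.deepcopy(id_net)
    detector.output = new_output_layer(len(detector.kernels), 2,
                                       random.Random(seed))
    return detector


# Step (a) would train id_net to classify the identification speakers;
# here it is only instantiated with 50 hypothetical speakers.
id_net = RawSpeechNet(n_filters=4, kernel_len=30, n_outputs=50)
detector = make_speaker_detector(id_net)

# Synthetic stand-in for a raw speech segment of 400 samples.
sample_rng = random.Random(2)
signal = [sample_rng.uniform(-1.0, 1.0) for _ in range(400)]
id_scores = id_net.forward(signal)
detector_scores = detector.forward(signal)
```

In this sketch each enrolled speaker would get their own `detector` copy, so adapting one detector leaves the shared identification network and the other detectors unchanged.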