Idiap Research Institute
On the Use of Convolutional Neural Networks for Speech Presentation Attack Detection
Type of publication: Conference paper
Citation: Korshunov_ISBA_2018
Publication status: Published
Booktitle: International Conference on Identity, Security and Behavior Analysis
Year: 2018
Month: January
Abstract: Research in automatic speaker verification (ASV) has advanced enough for industry to start using ASV systems in practical applications. However, these systems are highly vulnerable to spoofing or presentation attacks (PAs), which limits their wide deployment. Several speech-based presentation attack detection (PAD) methods have been proposed recently, but most of them are based on hand-crafted frequency or phase-based features. Although convolutional neural networks (CNNs) have already shown breakthrough results in face recognition, it is not well understood whether CNNs are as effective in detecting presentation attacks in speech. In this paper, to investigate the applicability of CNNs for PAD, we consider shallow and deep examples of CNN architectures implemented using TensorFlow and compare their performance with the state-of-the-art MFCC with GMM-based system on two large databases with presentation attacks: the publicly available voicePA and the proprietary BioCPqD-PA. We study the impact of increasing the depth of CNNs on performance, and note how they perform on unknown attacks by using one database to train and another to evaluate. The results demonstrate that CNNs are able to learn a database significantly better than hand-crafted features (increasing depth also improves performance). However, CNN-based PADs still lack the ability to generalize across databases and are unable to detect unknown attacks well.
Projects: Idiap
Authors: Korshunov, Pavel
Gonçalves, André R.
Violato, Ricardo P. V.
Simões, Flávio O.
Marcel, Sébastien
  • Korshunov_ISBA_2018.pdf