Impact of score fusion on voice biometrics and presentation attack detection in cross-database evaluations

Type of publication:	Journal paper
Citation:	Korshunov_STSP_2017
Publication status:	Published
Journal:	IEEE Journal of Selected Topics in Signal Processing
Volume:	11
Number:	4
Year:	2017
Month:	June
Pages:	695 - 705
DOI:	10.1109/JSTSP.2017.2692389
Abstract:	Research in the area of automatic speaker verification (ASV) has advanced enough for the industry to start using ASV systems in practical applications. However, these systems are highly vulnerable to spoofing or presentation attacks, limiting their wide deployment. Therefore, it is important to develop mechanisms that can detect such attacks, and it is equally important for these mechanisms to be seamlessly integrated into existing ASV systems for practical and attack-resistant solutions. To be practical, however, an attack detection method should (i) have high accuracy, (ii) generalize well to different attacks, and (iii) be simple and efficient. Several audio-based presentation attack detection (PAD) methods have been proposed recently, but their evaluation was usually done on a single, often obscure, database with a limited number of attacks. Therefore, in this paper, we conduct an extensive study of eight state-of-the-art PAD methods and evaluate their ability to detect known and unknown attacks (e.g., in a cross-database scenario) using two major publicly available speaker databases with spoofing attacks: AVspoof and ASVspoof. We investigate whether combining several PAD systems via score fusion can improve attack detection accuracy. We also study the impact of fusing PAD systems (via parallel and cascading schemes) with two i-vector and inter-session variability based ASV systems on the overall performance in both bona fide (no attacks) and spoof scenarios. The evaluation results question the efficiency and practicality of the existing PAD systems, especially when comparing results for individual databases and cross-database data. Fusing several PAD systems can lead to slightly improved performance; however, how to select which systems to fuse remains an open question. Joint ASV-PAD systems show a significantly increased resistance to the attacks at the expense of slightly degraded performance for bona fide scenarios.
Keywords:	Presentation Attack Detection, score fusion, speaker database, speaker recognition, voice biometrics
Projects:	Idiap, SWAN
Authors:	Korshunov, Pavel; Marcel, Sébastien
Attachments
Korshunov_STSP_2017.pdf