Long Term Spectral Statistics for Voice Presentation Attack Detection

Type of publication:	Idiap-RR
Citation:	Muckenhirn_Idiap-RR-11-2017
Number:	Idiap-RR-11-2017
Year:	2017
Month:	3
Institution:	Idiap
Abstract:	Automatic speaker verification systems can be spoofed through recorded, synthetic or voice-converted speech of target speakers. To make these systems practically viable, the detection of such attacks, referred to as presentation attacks, is of paramount interest. In that direction, this paper investigates two aspects: (a) a novel approach to detect presentation attacks where, unlike conventional approaches, no assumptions about the speech signal are made; rather, the attacks are detected by computing first-order and second-order spectral statistics and feeding them to a classifier, and (b) generalization of presentation attack detection systems across databases. Our investigations on the Interspeech 2015 ASVspoof challenge dataset and the AVspoof dataset show that, when compared to approaches based on conventional short-term spectral processing, the proposed approach with a linear discriminative classifier yields a better system, irrespective of whether the spoofed signal is replayed to the microphone or is directly injected into the system software process. Cross-database investigations show that neither the short-term spectral processing based approaches nor the proposed approach yields systems that are able to generalize across databases or methods of attack. This reveals the difficulty of the problem and the need for further resources and research.
Keywords:
Projects:	Idiap SWAN UNITS
Authors:	Muckenhirn, Hannah; Korshunov, Pavel; Magimai-Doss, Mathew; Marcel, Sébastien
Crossref by:	Muckenhirn_TASLP_2017
Added by:	[ADM]
Attachments
Muckenhirn_Idiap-RR-11-2017.pdf (MD5: b6d817c1c1499e7dd2234780b82d6afe)