Speaker Inconsistency Detection in Tampered Video
Type of publication: | Conference paper |
Citation: | Korshunov_EUSIPCO_2018 |
Publication status: | Published |
Booktitle: | European Signal Processing Conference |
Year: | 2018 |
Month: | September |
Abstract: | With the increasing amount of video consumed by people daily, there is a risk of a rise in maliciously modified video content (i.e., 'fake news') that could be used to harm innocent people or to impose a certain agenda, e.g., to meddle in elections. In this paper, we consider audio manipulations in video of a person speaking to the camera. Such manipulation is easy to perform; for instance, one can simply replace a part of the audio, yet it can dramatically change the message and the meaning of the video. With the goal of developing an automated system that can detect these audio-visual speaker inconsistencies, we consider several approaches proposed for lip-syncing and dubbing detection, based on convolutional and recurrent networks, and compare them with systems based on more traditional classifiers. We evaluated these methods on the publicly available VidTIMIT, AMI, and GRID databases, for which we generated sets of tampered data. |
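The core idea described in the abstract, that replaced audio no longer agrees with the visible lip motion, can be illustrated with a minimal sketch. This is not the paper's CNN/LSTM pipeline; it is a hypothetical heuristic that scores the correlation between per-frame audio energy and mouth-opening size, flagging low correlation as a possible inconsistency. All names and the threshold value are illustrative assumptions:

```python
import numpy as np

def av_sync_score(audio_energy, mouth_opening):
    """Pearson correlation between per-frame audio energy and
    mouth-opening size. Genuine speech tends to correlate;
    dubbed or replaced audio tends not to. (Illustrative
    heuristic, not the paper's method.)"""
    a = (audio_energy - audio_energy.mean()) / (audio_energy.std() + 1e-8)
    v = (mouth_opening - mouth_opening.mean()) / (mouth_opening.std() + 1e-8)
    return float(np.mean(a * v))

def is_tampered(audio_energy, mouth_opening, threshold=0.5):
    # Threshold chosen arbitrarily for this toy demo.
    return av_sync_score(audio_energy, mouth_opening) < threshold

# Synthetic demo: matched vs. mismatched audio/visual tracks.
rng = np.random.default_rng(0)
lips = np.abs(np.sin(np.linspace(0, 8 * np.pi, 200)))   # mouth opening
genuine_audio = lips + 0.1 * rng.standard_normal(200)   # tracks the lips
replaced_audio = rng.random(200)                        # unrelated segment

print(is_tampered(genuine_audio, lips))   # genuine pair
print(is_tampered(replaced_audio, lips))  # tampered pair
```

The approaches compared in the paper replace this hand-crafted score with learned audio-visual features (e.g., LSTM-based models), but the underlying signal being exploited is the same audio-visual agreement.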
Keywords: | benchmarking, lip-syncing, LSTM, video tampering |
Projects: | Idiap SAVI |
Authors: | Pavel Korshunov, Sébastien Marcel |
Added by: | [UNK] |