Black-box Attacks on Image Activity Prediction and its Natural Language Explanations

Type of publication:	Conference paper
Citation:	Baia_ICCVW_AROW_2023
Publication status:	Published
Booktitle:	Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
Year:	2023
URL:	https://ieeexplore.ieee.org/do...
DOI:	10.1109/ICCVW60793.2023.00396
Abstract:	Explainable AI (XAI) methods aim to describe the decision process of deep neural networks. Early XAI methods produced visual explanations, whereas more recent techniques generate multimodal explanations that include textual information and visual representations. Visual XAI methods have been shown to be vulnerable to white-box and gray-box adversarial attacks, with an attacker having full or partial knowledge of and access to the target system. As the vulnerabilities of multimodal XAI models have not been examined, in this paper we assess for the first time the robustness to black-box attacks of the natural language explanations generated by a self-rationalizing image-based activity recognition model. We generate unrestricted, spatially variant perturbations that disrupt the association between the predictions and the corresponding explanations to mislead the model into generating unfaithful explanations. We show that we can create adversarial images that manipulate the explanations of an activity recognition model by having access only to its final output.
Keywords:	Activity Prediction, Adversarial Examples, Explainable AI, Textual Explanation
Projects:	Idiap
Authors:	Baia, Alina Elena; Poggioni, Valentina; Cavallaro, Andrea
Attachments:	Baia_ICCVW_AROW_2023.pdf