On Attacks (Dis)Similarities to Test Adversarial Defense: Can We Reduce the Attack Set? / Puccetti T.; Zoppi T.; Ceccarelli A.. - ELECTRONIC. - 3731:(2024), pp. 0-0. (Paper presented at the 8th Italian Conference on Cyber Security, ITASEC 2024, held in Italy in 2024).

On Attacks (Dis)Similarities to Test Adversarial Defense: Can We Reduce the Attack Set?

Puccetti T.; Zoppi T.; Ceccarelli A.
2024

Abstract

Evaluating defensive solutions against adversarial evasion attacks means quantifying the defense's capability to detect or tolerate attacks. Ideally, a defense should be tested against all possible attacks; however, this is not achievable, and it is necessary to identify a representative attack set for the evaluation. Unfortunately, how to select such an attack set is an open question. Arguably, the selected attacks should apply diverse effects on the original image, in terms of dimension and distribution of the perturbation. We propose to quantify the perturbation through Image Quality metrics in addition to L-norms, such that adversarial attacks can be grouped (and only one representative of each group needs to be selected to test the defense) if they i) similarly perturb the attacked image, and ii) have similar success rate and detectability rate. Disappointingly, the analysis reveals that attacks applying similar image perturbations do not exhibit related success and detectability rates. Substantial evidence discourages grouping attacks and suggests that any reduction of the attack set impacts the validity of the defense evaluation.
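
The grouping criterion described in the abstract can be illustrated with a minimal sketch (not taken from the paper): the perturbation introduced by an attack is profiled with L-norms plus an image quality metric, here SSIM from scikit-image, assuming images are float arrays in [0, 1]. The function name and the specific metrics are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch (not from the paper): profile an adversarial
# perturbation with L-norms and an image quality metric (SSIM).
# Assumes images are float arrays in [0, 1] with shape (H, W, C).
import numpy as np
from skimage.metrics import structural_similarity


def perturbation_profile(original: np.ndarray, adversarial: np.ndarray) -> dict:
    """Return L2 / L-infinity norms of the perturbation and the SSIM
    between the original and the adversarial image."""
    delta = adversarial - original
    return {
        "l2": float(np.linalg.norm(delta.ravel(), ord=2)),
        "linf": float(np.abs(delta).max()),
        "ssim": float(structural_similarity(
            original, adversarial, data_range=1.0, channel_axis=-1)),
    }


if __name__ == "__main__":
    # Toy example with random data standing in for a clean/adversarial pair.
    rng = np.random.default_rng(0)
    x = rng.random((32, 32, 3))
    x_adv = np.clip(x + rng.normal(scale=0.01, size=x.shape), 0.0, 1.0)
    print(perturbation_profile(x, x_adv))
```

Under the criterion outlined in the abstract, two attacks with close perturbation profiles would additionally need similar success and detectability rates before being grouped; the abstract reports that this correspondence does not hold in practice.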
Year: 2024
Published in: CEUR Workshop Proceedings
Conference: 8th Italian Conference on Cyber Security, ITASEC 2024
Conference location: Italy
Conference year: 2024
Authors: Puccetti T.; Zoppi T.; Ceccarelli A.
Files in this item:

paper23.pdf
  Access: Closed access
  Type: Publisher's PDF (Version of record)
  License: Read-only
  Size: 783.12 kB
  Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/2158/1385152
Citations
  • PMC: not available
  • Scopus: 0
  • Web of Science: not available