On Attacks (Dis)Similarities to Test Adversarial Defense: Can We Reduce the Attack Set? / Puccetti T.; Zoppi T.; Ceccarelli A. - Electronic. - 3731:(2024), pp. 0-0. (Paper presented at the 8th Italian Conference on Cyber Security, ITASEC 2024, held in Italy in 2024.)
On Attacks (Dis)Similarities to Test Adversarial Defense: Can We Reduce the Attack Set?
Puccetti T.; Zoppi T.; Ceccarelli A.
2024
Abstract
Evaluating defensive solutions against adversarial evasion attacks means quantifying the defense's capability to detect or tolerate attacks. Ideally, a defense should be tested against all possible attacks; however, this is not achievable, and it is necessary to identify a representative attack set for the evaluation. Unfortunately, how to select such an attack set is an open question. Arguably, the selected attacks should produce diverse effects on the original image, in terms of the dimension and distribution of the perturbation. We propose to quantify the perturbation through Image Quality metrics in addition to L-norms, such that adversarial attacks can be grouped (and only one representative of the group can be selected to test the defense) if they i) similarly perturb the attacked image, and ii) have similar success rate and detectability rate. Disappointingly, the analysis reveals that attacks with similar image perturbation cannot be related. Substantial evidence discourages grouping attacks and suggests that any reduction of the attack set impacts the validity of the defense evaluation.
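As a minimal illustration of the idea sketched in the abstract (not the authors' implementation: the function names, the specific metrics, and the tolerance values below are assumptions), the snippet computes the L-norms of an adversarial perturbation together with basic image-quality measures, and checks the grouping criterion that two attacks are candidates for sharing one representative only if they perturb images similarly and have similar success and detectability rates.

```python
# Illustrative sketch only: quantify how an attack perturbs an image (L-norms plus
# simple quality metrics) and test a hedged version of the grouping criterion from
# the abstract. Metric choices, helper names, and tolerances are assumptions.
import numpy as np

def perturbation_profile(x: np.ndarray, x_adv: np.ndarray) -> dict:
    """L-norms and simple quality metrics of the perturbation; images in [0, 1]."""
    delta = (x_adv - x).ravel()
    mse = float(np.mean(delta ** 2))
    psnr = float("inf") if mse == 0 else 10.0 * np.log10(1.0 / mse)  # data range = 1.0
    return {
        "L0": int(np.count_nonzero(delta)),          # number of changed pixels
        "L2": float(np.linalg.norm(delta, ord=2)),   # Euclidean size of the perturbation
        "Linf": float(np.abs(delta).max()),          # largest single-pixel change
        "MSE": mse,
        "PSNR_dB": psnr,
    }

def could_group(profile_a: dict, profile_b: dict,
                rates_a: tuple, rates_b: tuple,
                metric_tol: float = 0.10, rate_tol: float = 0.05) -> bool:
    """True only if both the perturbation metrics and the (success, detectability)
    rates of the two attacks are close; the tolerances are illustrative."""
    for key in ("L2", "Linf", "PSNR_dB"):
        a, b = profile_a[key], profile_b[key]
        if abs(a - b) > metric_tol * max(abs(a), abs(b), 1e-12):
            return False
    return all(abs(ra - rb) <= rate_tol for ra, rb in zip(rates_a, rates_b))

# Example usage with random data standing in for an image and two adversarial versions.
rng = np.random.default_rng(0)
x = rng.random((32, 32, 3))
adv1 = np.clip(x + rng.normal(scale=0.01, size=x.shape), 0.0, 1.0)
adv2 = np.clip(x + rng.normal(scale=0.01, size=x.shape), 0.0, 1.0)
p1, p2 = perturbation_profile(x, adv1), perturbation_profile(x, adv2)
print(could_group(p1, p2, rates_a=(0.90, 0.60), rates_b=(0.88, 0.62)))
```

In the paper's terms, two attacks that fail such a check should not share a single representative; the finding reported in the abstract is that, even when the perturbation profiles are similar, the attacks do not behave alike, which argues against reducing the attack set.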
| File | Type | License | Access | Size | Format |
|---|---|---|---|---|---|
| paper23.pdf | Editorial PDF (Version of record) | Read-only | Closed access (request a copy) | 783.12 kB | Adobe PDF |
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.