Detection of Adversarial Attacks by Observing Deep Features with Structured Data Algorithms

Puccetti, Tommaso; Ceccarelli, Andrea; Zoppi, Tommaso; Bondavalli, Andrea

doi:10.1145/3555776.3577629

Deep Neural Networks (DNNs) are highly vulnerable to adversarial attacks, which introduce human-imperceptible perturbations on the input to fool a DNN model. Detecting such attacks is fundamental to protecting distributed applications that process input data using DNNs. Detection strategies typically rely on complex solutions that include modifications to the input and to the DNN model itself, and/or the deployment of a second DNN that flags suspected attacks. Despite these efforts, at the present stage of research, the ability to protect against adversarial attacks is unsatisfactory. Unlike most approaches, this paper proposes RISOTTO (adveRsarIal attackS detectiOn using strucTured daTa algOrithms), a very simple but effective and fast detection strategy that uses algorithms for structured data. RISOTTO modifies neither the DNN model nor its inputs, and requires only the values of selected deep features at test time. Using the deep features of a single layer, the accuracy in detecting known attacks is 1.0 for the two MNIST models and all the selected attacks. Also, by combining the deep features of multiple layers, we show that our approach is competitive with or better than state-of-the-art unsupervised solutions in detecting unknown attacks, especially for MNIST models with few deep features (below 1 million).
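The general idea of treating deep features as rows of a table and handing them to an algorithm for structured data can be illustrated with a minimal sketch. This is not the authors' implementation: the feature vectors below are synthetic stand-ins for layer activations, the feature count and the shift between clean and adversarial inputs are arbitrary assumptions, and a nearest-centroid classifier is swapped in for whichever structured-data algorithms the paper actually evaluates. All names (`synthetic_features`, `CentroidDetector`, `N_FEATURES`) are hypothetical.

```python
import random

random.seed(0)

N_FEATURES = 16  # assumption: number of deep features taken from one layer


def synthetic_features(n, shift):
    """Stand-in for deep features; a real pipeline would read layer activations."""
    return [[random.gauss(shift, 1.0) for _ in range(N_FEATURES)] for _ in range(n)]


def centroid(rows):
    """Column-wise mean of a list of feature rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class CentroidDetector:
    """Nearest-centroid classifier used here as a simple tabular detector."""

    def fit(self, clean, adversarial):
        self.clean_c = centroid(clean)
        self.adv_c = centroid(adversarial)
        return self

    def predict(self, row):
        """Return 1 if the row is closer to the adversarial centroid, else 0."""
        return int(sq_dist(row, self.adv_c) < sq_dist(row, self.clean_c))


# Assumption: adversarial inputs shift the deep features by a fixed offset.
train_clean = synthetic_features(200, 0.0)
train_adv = synthetic_features(200, 1.5)
detector = CentroidDetector().fit(train_clean, train_adv)

# Evaluate on held-out synthetic rows (0 = clean, 1 = adversarial).
test_rows = synthetic_features(100, 0.0) + synthetic_features(100, 1.5)
test_labels = [0] * 100 + [1] * 100
predictions = [detector.predict(row) for row in test_rows]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(f"detection accuracy on synthetic features: {accuracy:.2f}")
```

The DNN itself is untouched in this setup: the detector only consumes feature rows observed at test time, which mirrors the constraint stated in the abstract. The accuracy printed here reflects the synthetic data only and says nothing about the results reported in the paper.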

Detection of Adversarial Attacks by Observing Deep Features with Structured Data Algorithms / Tommaso Puccetti; Andrea Ceccarelli; Tommaso Zoppi; Andrea Bondavalli. - Electronic. - (2023), pp. 125-134. (Paper presented at the 38th Annual ACM Symposium on Applied Computing, SAC 2023, held in est in 2023) [10.1145/3555776.3577629].