Federated and Generative Data Sharing for Data-Driven Security: Challenges and Approach

Natella, Roberto; Ceccarelli, Andrea; Ficco, Massimo

doi:10.1109/DASC/PiCom/CBDCom/Cy55231.2022.9927754

Modern cyber-attacks are evolving into Advanced Persistent Threats (APTs): attacks orchestrated by cybercriminals or state-sponsored groups, who perform carefully planned, stealthy, targeted attacks that span a long period of time. It is difficult to defend against APTs, mostly because of the absence of high-quality data to build detectors and train personnel. In fact, new attacks are continuously crafted, and most organizations are unwilling to share data about attacks they have experienced. In this paper, we argue for an approach to the automatic generation of representative datasets of APTs, without forcing organizations to disclose their sensitive information. We propose to adopt the Federated Learning paradigm to train a Generative Machine Learning model, which will generate new traces of network and host events representative of real APT attacks. Blockchain-based strategies will overcome the typical shortcomings of a centralized approach, such as a single point of failure and malicious clients. The generated APT datasets can be leveraged for training and assessing AI-based APT detectors, and for emulating attacks in live cyber-range exercises.
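
To make the federated training idea concrete, the following is a minimal sketch of federated averaging (FedAvg), a common aggregation scheme in Federated Learning. It is an illustration under stated assumptions, not the method of the paper: the "generative model" is reduced to a vector of per-feature means, and the names `local_update`, `fed_avg`, and `train` are hypothetical. Each organization updates the shared parameters on its own event vectors and shares only the updated parameters, never the raw data.

```python
from typing import List

Vector = List[float]


def local_update(weights: Vector, data: List[Vector], lr: float) -> Vector:
    """One local gradient step on a client's private data.

    Toy stand-in for generative-model training (an assumption): the model
    is a vector of per-feature means, fitted with a squared-error loss.
    """
    n = len(data)
    # Gradient of 0.5 * mean((w_i - x_i)^2) with respect to each w_i.
    grad = [sum(w - x[i] for x in data) / n for i, w in enumerate(weights)]
    return [w - lr * g for w, g in zip(weights, grad)]


def fed_avg(client_weights: List[Vector], client_sizes: List[int]) -> Vector:
    """Average client parameters, weighted by each client's sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def train(clients: List[List[Vector]], dim: int,
          rounds: int = 50, lr: float = 0.5) -> Vector:
    """Run several federated rounds; raw data never leaves a client."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        # Each client starts from the current global parameters.
        updates = [local_update(global_w, data, lr) for data in clients]
        # Only the parameter updates are aggregated.
        global_w = fed_avg(updates, [len(d) for d in clients])
    return global_w


if __name__ == "__main__":
    # Two hypothetical organizations with private 2-feature event vectors.
    org_a = [[1.0, 2.0], [3.0, 4.0]]
    org_b = [[5.0, 6.0]]
    print(train([org_a, org_b], dim=2))
```

In this toy setting the global parameters converge to the mean of the pooled data, although no organization reveals its individual records. The sketch uses a central aggregator, so it does not address the single point of failure or malicious clients that the blockchain-based strategies are meant to handle.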

Federated and Generative Data Sharing for Data-Driven Security: Challenges and Approach / Roberto Natella, Andrea Ceccarelli, Massimo Ficco. - Electronic. - (2022), pp. 0-0. (Paper presented at the 20th International Workshop on Assurance in Distributed Systems and Networks (ADSN 2022)) [10.1109/DASC/PiCom/CBDCom/Cy55231.2022.9927754].