Addressing Data Security in IoT: Minimum Sample Size and Denoising Diffusion Models for Improved Malware Detection / Camerota, Chiara; Pappone, Lorenzo; Pecorella, Tommaso; Esposito, Flavio. - ELECTRONIC. - (2024), pp. 1-7. (Paper presented at the 20th International Conference on Network and Service Management (CNSM), held in Prague, 28-31 October 2024) [10.23919/cnsm62983.2024.10814607].
Addressing Data Security in IoT: Minimum Sample Size and Denoising Diffusion Models for Improved Malware Detection
Camerota, Chiara (Member of the Collaboration Group);
Pecorella, Tommaso (Member of the Collaboration Group)
2024
Abstract
Machine learning (ML) has emerged as a compelling approach to identifying attacks in network traffic. Existing malware detection strategies often concentrate on specific facets, such as efficient data collection, particular types of malware, or handling data scarcity. While valid, these strategies typically overlook the potential for minimizing sample size, focusing instead on data augmentation. This work introduces a novel method to determine the minimum sample size necessary to achieve a specified accuracy level, measured by the F1 score derived from the confusion matrix. We focus on TCP header traffic data transformed into images through flow-splitting techniques for multi-class traffic classification. In addition, we introduce a diffusion model to generate new synthetic traffic images and show that our method outperforms existing techniques in terms of stability and predictability. This study also compares the effectiveness of synthetic image augmentation using Generative Adversarial Networks (GANs) and Denoising Diffusion Probabilistic Models (DDPMs) in improving image recognition and classification accuracy.

File | Access | Type | License | Size | Format
---|---|---|---|---|---
Addressing_Data_Security_in_IoT_Minimum_Sample_Size_and_Denoising_Diffusion_Models_for_Improved_Malware_Detection.pdf | Open access | Publisher's PDF (Version of record) | All rights reserved | 6.49 MB | Adobe PDF
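
The abstract measures accuracy by the F1 score derived from the confusion matrix and asks for the minimum sample size that reaches a specified level. The Python below is a minimal sketch of that idea only, not the authors' method: it computes a macro-averaged F1 from a multi-class confusion matrix and scans a learning curve for the smallest training-set size that meets a target. The function names, the choice of macro averaging, and the example matrices are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def macro_f1(cm: List[List[int]]) -> float:
    """Macro-averaged F1 from a square confusion matrix.

    Assumed layout: rows are true classes, columns are predicted classes.
    """
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[r][k] for r in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted class differs
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0 by convention.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


def min_sample_size(curve: Dict[int, List[List[int]]], target: float) -> Optional[int]:
    """Smallest sample size in `curve` whose macro F1 reaches `target`.

    `curve` maps a training-set size to the confusion matrix observed at that
    size (hypothetical input; the paper's actual procedure is not shown here).
    Returns None if no evaluated size reaches the target.
    """
    for size in sorted(curve):
        if macro_f1(curve[size]) >= target:
            return size
    return None


if __name__ == "__main__":
    # Hypothetical two-class confusion matrices at two training-set sizes.
    curve = {
        100: [[5, 5], [5, 5]],
        200: [[9, 1], [1, 9]],
    }
    print(min_sample_size(curve, target=0.85))
```

A linear scan over the evaluated sizes is the simplest choice here; it assumes nothing about how F1 changes with sample size, at the cost of requiring one trained and evaluated model per candidate size.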
Documents in FLORE are protected by copyright, and all rights are reserved unless otherwise indicated.