Addressing Data Security in IoT: Minimum Sample Size and Denoising Diffusion Models for Improved Malware Detection / Camerota, Chiara; Pappone, Lorenzo; Pecorella, Tommaso; Esposito, Flavio. - Electronic. - (2024), pp. 1-7. (Paper presented at the 20th International Conference on Network and Service Management (CNSM), held in Prague, 28 - 31 October 2024) [10.23919/cnsm62983.2024.10814607].

Addressing Data Security in IoT: Minimum Sample Size and Denoising Diffusion Models for Improved Malware Detection

Camerota, Chiara (Member of the Collaboration Group);
Pecorella, Tommaso (Member of the Collaboration Group);
2024

Abstract

Machine learning (ML) has emerged as a compelling approach to identifying attacks in network traffic. Existing malware detection strategies often concentrate on specific facets, such as efficient data collection, particular types of malware, or handling data scarcity. While valid, these strategies typically overlook the potential for minimizing sample size and focus instead on data augmentation. This work introduces a novel method to determine the minimum sample size necessary to achieve a specified accuracy level, measured by the F1 score derived from the confusion matrix. We focus on TCP header traffic data transformed into images through flow-splitting techniques for multi-class traffic classification. In addition, we introduce a diffusion model to generate new synthetic traffic images and show that our method outperforms existing techniques in terms of stability and predictability. This study also compares the effectiveness of synthetic image augmentation using Generative Adversarial Networks (GANs) and Denoising Diffusion Probabilistic Models (DDPMs) in improving image recognition and classification accuracy.
2024
20th International Conference on Network and Service Management (CNSM)
Prague
28 - 31 October 2024
Camerota, Chiara; Pappone, Lorenzo; Pecorella, Tommaso; Esposito, Flavio
Files in this item:
File: Addressing_Data_Security_in_IoT_Minimum_Sample_Size_and_Denoising_Diffusion_Models_for_Improved_Malware_Detection.pdf
Access: open access
Type: Publisher's PDF (Version of record)
License: All rights reserved
Size: 6.49 MB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/2158/1406133