
Relative privacy risks and learning from anonymized data / Boreale, Michele; Corradi, Fabio. - ELECTRONIC. - (2017), pp. 225-230.

Relative privacy risks and learning from anonymized data

Boreale, Michele; Corradi, Fabio
2017

Abstract

We consider group-based anonymized tables, a popular approach to data publishing. This approach aims to protect the privacy of the individuals involved by releasing an obfuscated version of the original data, in which the exact correspondence between individuals and attribute values is hidden. When publishing data about individuals, one must typically balance the learner's utility against the risk posed by an attacker who may target individuals in the dataset. Accordingly, we propose an MCMC-based methodology by which a data curator can simultaneously: (a) learn the population parameters from a given anonymized table, thus assessing its utility; (b) analyze the risk, for any individual in the dataset, of being linked to a specific sensitive value, beyond what can be inferred from the population parameters learned in (a), when the attacker has learned the individual's nonsensitive attributes. We call this relative risk analysis. We propose a unified probabilistic model that encompasses both horizontal group-based anonymization schemes, such as k-anonymity, and vertical ones, such as Anatomy. We detail the learning procedure for both the honest learner and the attacker. Based on the learned distributions, we put forward relative risk measures. Finally, we illustrate some experiments conducted with the proposed methodology on a real-world dataset.
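The abstract's notion of relative risk (how much an attacker who knows an individual's nonsensitive attributes can sharpen their belief about a sensitive value, beyond the population-level baseline) can be illustrated with a toy sketch. This is not the paper's MCMC procedure; it is a minimal frequency-based example on a hypothetical k-anonymized table, with invented quasi-identifier groups and sensitive values, shown only to make the baseline-vs-posterior comparison concrete.

```python
from collections import Counter

# Hypothetical k-anonymized table (k = 3): each quasi-identifier (QI)
# group appears at least 3 times; within a group, the link between
# individual and sensitive value is hidden.
table = [
    # (generalized QI, sensitive value)
    ("age 30-39, F", "flu"),
    ("age 30-39, F", "flu"),
    ("age 30-39, F", "cancer"),
    ("age 40-49, M", "flu"),
    ("age 40-49, M", "cold"),
    ("age 40-49, M", "cold"),
]

# Population baseline: P(sensitive = s) estimated over the whole table,
# i.e. what an honest learner infers about the population.
n = len(table)
baseline = {s: c / n for s, c in Counter(s for _, s in table).items()}

def posterior(qi):
    """Attacker's belief after conditioning on the target's QI group."""
    group = [s for q, s in table if q == qi]
    return {s: c / len(group) for s, c in Counter(group).items()}

def relative_risk(qi, s):
    """How much group membership sharpens belief beyond the baseline."""
    return posterior(qi).get(s, 0.0) / baseline[s]

# Knowing the target is in the "age 30-39, F" group raises the attacker's
# belief in "cancer" from 1/6 (baseline) to 1/3 (posterior).
print(relative_risk("age 30-39, F", "cancer"))  # → 2.0
```

A ratio above 1 signals that the anonymized table still leaks individual-level information relative to what the population parameters alone reveal; the paper's methodology replaces these toy frequencies with distributions learned via MCMC.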
2017
978-88-6453-521-0
SIS 2017. Statistics and Data Science: new challenges, new generations
225
230
Boreale, Michele; Corradi, Fabio
Files in this product:
File  Size  Format
3407_11724 [Pages 225 - 230].pdf

open access

Description: Contribution in the proceedings volume
Type: Publisher's PDF (Version of record)
License: Open Access
Size: 171.12 kB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1093474