Extended k-anonymity models against sensitive attribute disclosure

Article ID	Journal	Published Year	Pages	File Type
450128	Computer Communications	2011	10 Pages	PDF

Abstract

The p-sensitive k-anonymity model has recently been defined as a refinement of k-anonymity. This property requires that there be at least p distinct values for each sensitive attribute within the records sharing a set of quasi-identifier (QI) attributes. In this paper, we identify situations in which the p-sensitive k-anonymity property is not sufficient to protect sensitive attributes. To overcome this shortcoming of the p-sensitive k-anonymity principle, we propose two enhanced privacy requirements, namely the p+-sensitive k-anonymity and (p, α)-sensitive k-anonymity properties. The two newly introduced models take different perspectives. Instead of focusing on the specific values of sensitive attributes, the p+-sensitive k-anonymity model is concerned with the categories to which the values belong. Although the (p, α)-sensitive k-anonymity model still focuses on the specific values, it includes an ordinal metric system to measure how much the specific sensitive attribute values contribute to each QI-group. We give a thorough theoretical analysis of the hardness of computing a data set that satisfies either p+-sensitive k-anonymity or (p, α)-sensitive k-anonymity. We devise a set of algorithms using the idea of top-down specification, which is clearly illustrated in the paper. We implement our algorithms on two real-world data sets and show in comprehensive experimental evaluations that the two newly introduced models are superior to the previous method in terms of effectiveness and efficiency.
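
To make the baseline property concrete, the following is a minimal Python sketch of a check for p-sensitive k-anonymity as described in the abstract: records are grouped by their quasi-identifier values, each group must contain at least k records, and each sensitive attribute must take at least p distinct values within every group. The function name, parameter names, and toy records are hypothetical and are not taken from the paper; the sketch does not cover the p+-sensitive or (p, α)-sensitive variants, whose exact definitions are not given here.

```python
from collections import defaultdict


def is_p_sensitive_k_anonymous(records, quasi_identifiers, sensitive_attrs, k, p):
    """Check the p-sensitive k-anonymity property on a list of dict records.

    Hypothetical helper for illustration only, not the paper's algorithm.
    """
    # Group records that share the same quasi-identifier values (QI-groups).
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[attr] for attr in quasi_identifiers)
        groups[key].append(rec)

    for group in groups.values():
        # k-anonymity: every QI-group holds at least k records.
        if len(group) < k:
            return False
        # p-sensitivity: every sensitive attribute has at least p
        # distinct values inside the group.
        for attr in sensitive_attrs:
            if len({rec[attr] for rec in group}) < p:
                return False
    return True


# Toy, made-up table with already generalized quasi-identifiers.
records = [
    {"zip": "130**", "age": "<30", "disease": "flu"},
    {"zip": "130**", "age": "<30", "disease": "HIV"},
    {"zip": "148**", "age": ">=40", "disease": "cancer"},
    {"zip": "148**", "age": ">=40", "disease": "cancer"},
]

print(is_p_sensitive_k_anonymous(records, ["zip", "age"], ["disease"], k=2, p=1))
print(is_p_sensitive_k_anonymous(records, ["zip", "age"], ["disease"], k=2, p=2))
```

In this toy table both QI-groups have two records, so the table is 2-anonymous, but the second group contains a single disease value, so the check fails for p = 2. That is the kind of sensitive attribute disclosure that p-sensitivity is meant to rule out.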

Keywords

NP-hard; k-anonymity; Experiments; Algorithm