A family of enhanced (L,α)-diversity models for privacy preserving data publishing

Article ID	Journal	Published Year	Pages	File Type
424732	Future Generation Computer Systems	2011	9 Pages	PDF

Abstract

Privacy preservation is an important issue in the release of data for mining purposes. Recently, a novel l-diversity privacy model was proposed. However, even an l-diverse data set may have severe problems that lead to the revelation of individual sensitive information. In this paper, we remedy the problem by introducing distinct (l,α)-diversity, which, intuitively, demands that the total weight of the sensitive values in a given QI-group is at least α, where the weight is controlled by a pre-defined recursive metric system. We provide a thorough analysis of distinct (l,α)-diversity and prove that the optimal distinct (l,α)-diversity problem and its two variants, entropy (l,α)-diversity and recursive (c,l,α)-diversity, are NP-hard. We propose a top-down anonymization approach to solve the distinct (l,α)-diversity problem and its variants. Extensive experimental evaluations show that the proposed methods are practical in terms of utility measurements and can be implemented efficiently.
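
The intuition behind distinct (l,α)-diversity can be illustrated with a minimal sketch. It is not the paper's algorithm: the function name, the sample data, and the flat integer `weights` table are hypothetical, and the table only stands in for the recursive metric system, which the abstract does not specify. The sketch also assumes that each distinct sensitive value in a QI-group is counted once when the weights are summed.

```python
def is_distinct_l_alpha_diverse(groups, weights, l, alpha):
    """Check a simplified reading of distinct (l, alpha)-diversity.

    groups  : list of QI-groups, each a list of sensitive values
    weights : hypothetical mapping from sensitive value to weight
              (a stand-in for the paper's recursive metric system)
    Assumption: each distinct value contributes its weight once.
    """
    for group in groups:
        distinct_values = set(group)
        # Distinct l-diversity: at least l different sensitive values.
        if len(distinct_values) < l:
            return False
        # Weight requirement: total weight of the group is at least alpha.
        if sum(weights[v] for v in distinct_values) < alpha:
            return False
    return True


# Hypothetical example data: integer weights, higher means more sensitive.
weights = {"flu": 1, "cold": 1, "HIV": 5, "cancer": 4}
groups = [
    ["flu", "cold", "flu"],      # 2 distinct values, total weight 2
    ["HIV", "cancer", "flu"],    # 3 distinct values, total weight 10
]

print(is_distinct_l_alpha_diverse(groups, weights, l=2, alpha=2))
print(is_distinct_l_alpha_diverse(groups, weights, l=2, alpha=3))
```

In this toy data both groups are 2-diverse, yet the first group fails once α exceeds 2, which shows how the weight threshold adds a requirement on top of the plain count of distinct values.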

Research highlights
► We propose a family of enhanced (l,α)-diversity models to amend the current privacy principles.
► We theoretically prove that computing the enhanced privacy models is NP-hard.
► We provide a family of top-down specification methods for the introduced problems.
► We conduct extensive experiments to verify the effectiveness and efficiency of the methods.