Article code: 402506
Journal code: 676953
Publication year: 2016
English article: 19 pages, PDF
Full-text version: free download
English title of the ISI article
Hierarchical anonymization algorithms against background knowledge attack in data releasing
Persian translation of the title
الگوریتم های شناسایی سلسله مراتبی علیه حمله دانش پس زمینه در انتشار داده ها
Keywords
Privacy preservation, tabular data, hierarchical anonymization algorithm, background knowledge, information loss measure
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract


• We define a privacy model based on k-anonymity and one of its strong refinements to prevent the background knowledge attack.
• We propose two hierarchical anonymization algorithms to satisfy our privacy model.
• Our algorithms outperform the state-of-the-art anonymization algorithm in terms of utility and privacy.
• We extend an information loss measure to capture data inaccuracies caused by records that do not fit in any equivalence class.
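
As a rough illustration of the privacy model named in the first highlight, the sketch below checks a set of equivalence classes against k-anonymity and against β-likeness, the refinement named in the abstract. It assumes the basic form of β-likeness, in which the relative increase of a sensitive value's frequency inside a class over its overall frequency must not exceed β. The function names and the input layout (one list of sensitive values per class) are hypothetical and are not taken from the paper.

```python
from collections import Counter


def is_k_anonymous(classes, k):
    """Every equivalence class must contain at least k records."""
    return all(len(c) >= k for c in classes)


def satisfies_basic_beta_likeness(classes, beta):
    """Assumed basic beta-likeness check (hypothetical helper).

    classes: list of equivalence classes, each a list of sensitive values.
    The overall distribution is computed over all records in `classes`.
    """
    all_values = [s for c in classes for s in c]
    n = len(all_values)
    overall = {v: cnt / n for v, cnt in Counter(all_values).items()}
    for c in classes:
        for v, cnt in Counter(c).items():
            q = cnt / len(c)   # frequency of v inside this class
            p = overall[v]     # frequency of v in the whole table
            # Only positive information gain is bounded.
            if q > p and (q - p) / p > beta:
                return False
    return True


classes = [["flu", "flu", "cancer"], ["flu", "cancer", "cancer"]]
print(is_k_anonymous(classes, 3))
print(satisfies_basic_beta_likeness(classes, 0.5))
```

In this toy input each sensitive value has an overall frequency of 0.5 and a maximum in-class frequency of 2/3, so the largest relative gain is about 0.33.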

Preserving privacy in the presence of an adversary's background knowledge is very important in data publishing. The k-anonymity model, while protecting identity, does not protect against attribute disclosure. β-likeness, one of the strong refinements of k-anonymity, does not protect against identity disclosure. Neither model protects against attacks that exploit background knowledge. This research proposes two approaches for generating k-anonymous β-likeness datasets that protect against identity and attribute disclosures and prevent attacks that exploit any data correlations between quasi-identifiers (QIs) and sensitive attribute values as the adversary's background knowledge. In particular, two hierarchical anonymization algorithms are proposed. Both algorithms apply agglomerative clustering techniques in their first stage in order to generate clusters of records whose probability distributions extracted from background knowledge are similar. In the next phase, k-anonymity and β-likeness are enforced in order to prevent identity and attribute disclosures. Our extensive experiments demonstrate that the proposed algorithms outperform other state-of-the-art anonymization algorithms in terms of privacy and data utility, with fewer unpublished records than the others. Because well-known information loss metrics fail to measure precisely the data inaccuracies stemming from the removal of records that cannot be published in any equivalence class, this research also introduces an extension to the Global Certainty Penalty metric that considers unpublished records.
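
To make the idea of an information loss metric that considers unpublished records concrete, here is a minimal sketch. It computes a Global Certainty Penalty over numeric quasi-identifiers in the usual way (class size times the summed normalized attribute ranges, divided by the number of attributes times the number of records) and then, as an assumption of this sketch only, charges each unpublished record the maximum penalty of 1 per attribute. The abstract does not give the paper's actual extension, so the formula, the function names, and the parameters below are illustrative.

```python
def ncp_numeric(values, domain_min, domain_max):
    """Normalized range of one numeric attribute within a non-empty class."""
    if domain_max == domain_min:
        return 0.0
    return (max(values) - min(values)) / (domain_max - domain_min)


def gcp_with_suppression(classes, suppressed_count, domains):
    """Assumed GCP variant that penalizes unpublished records.

    classes: list of non-empty equivalence classes, each a list of
             records given as tuples of numeric QI values.
    suppressed_count: number of records that were not published.
    domains: list of (min, max) pairs, one per QI attribute.
    """
    d = len(domains)
    n_total = sum(len(c) for c in classes) + suppressed_count
    if n_total == 0 or d == 0:
        return 0.0
    penalty = 0.0
    for c in classes:
        class_ncp = sum(
            ncp_numeric([r[i] for r in c], lo, hi)
            for i, (lo, hi) in enumerate(domains)
        )
        penalty += len(c) * class_ncp
    # Assumption: a suppressed record loses all information on every attribute.
    penalty += suppressed_count * d
    return penalty / (d * n_total)


domains = [(0, 100), (0, 10)]
classes = [[(20, 1), (30, 3)], [(50, 5), (70, 5)]]
print(gcp_with_suppression(classes, 0, domains))
print(gcp_with_suppression(classes, 1, domains))
```

Under this assumption, suppressing even one record raises the score noticeably, which reflects the abstract's point that a metric ignoring unpublished records understates the loss.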

Publisher
Database: Elsevier - ScienceDirect
Journal: Knowledge-Based Systems - Volume 101, 1 June 2016, Pages 71–89