کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
379076 659260 2009 24 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Privacy-preserving data publishing for cluster analysis
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
پیش نمایش صفحه اول مقاله
Privacy-preserving data publishing for cluster analysis
چکیده انگلیسی

Releasing person-specific data could potentially reveal sensitive information about individuals. k-anonymization is a promising privacy protection mechanism in data publishing. Although substantial research has been conducted on k-anonymization and its extensions in recent years, only a few prior works have considered releasing data for some specific purpose of data analysis. This paper presents a practical data publishing framework for generating a masked version of data that preserves both individual privacy and information usefulness for cluster analysis. Experiments on real-life data suggest that by focusing on preserving cluster structure in the masking process, the cluster quality is significantly better than the cluster quality of the masked data without such focus. The major challenge of masking data for cluster analysis is the lack of class labels that could be used to guide the masking process. Our approach converts the problem into the counterpart problem for classification analysis, wherein class labels encode the cluster structure in the data, and presents a framework to evaluate the cluster quality on the masked data.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Data & Knowledge Engineering - Volume 68, Issue 6, June 2009, Pages 552–575
نویسندگان
, , , ,