کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
6883805 1444207 2018 45 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Privacy-preserving mechanisms for k-modes clustering
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر شبکه های کامپیوتری و ارتباطات
پیش نمایش صفحه اول مقاله
Privacy-preserving mechanisms for k-modes clustering
چکیده انگلیسی
Clustering categorical data is an important data mining task with rich applications. As an extension of k-means applied to categorical data, the k-modes algorithm became a popular clustering tool due to its simplicity and efficiency. A lot of improvements for k-modes such as better initialization techniques or more effective dissimilarity scores have been proposed recently. However, the problem of running the k-modes in private manners is rarely considered. In this paper, we address the privacy-preserving k-modes problem using differential privacy, a formal and rigorous definition of privacy for data publication. Differential privacy guarantees that the existence of any item in the input dataset is indistinguishable by looking at the computation output. We analyze the challenges of differentially private k-modes with regard to the k-means counterpart. Then we propose several schemes in both interactive and non-interactive settings. We prove that our mechanisms satisfy differential privacy and run linearly in the number of data points. Evaluation over fifteen real datasets shows that we can achieve useful privacy-preserving clustering outputs. In terms of clustering cost, the interactive approaches perform better than the non-interactive schemes and the solution adapted from k-means.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Computers & Security - Volume 78, September 2018, Pages 60-75
نویسندگان
,