Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6853886 | Data & Knowledge Engineering | 2018 | 26 Pages | |
Abstract
In this paper, we address this challenge with a two-stage scheme called RG+RP. In the first stage, each participant perturbs their data by passing it through a nonlinear function called repeated Gompertz (RG). In the second stage, the participant projects the perturbed data to a lower dimension in an (almost) distance-preserving manner, using a specific random projection (RP) matrix. The nonlinear RG function is designed to mitigate maximum a posteriori (MAP) estimation attacks, while random projection resists independent component analysis (ICA) attacks and ensures clustering accuracy. The proposed two-stage randomisation scheme is assessed in terms of its recovery resistance to MAP estimation attacks. Preliminary theoretical analysis as well as experimental results on synthetic and real-world datasets indicate that RG+RP has better recovery resistance to MAP estimation attacks than most state-of-the-art techniques. For clustering, fuzzy c-means (FCM) is used. Results using seven cluster validity indices, root mean squared error (RMSE) and accuracy ratio show that clustering results based on two-stage-perturbed data are comparable to the clustering results based on raw data, which confirms the utility of our privacy-preserving scheme when used with either FCM or hard c-means (HCM).
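The two-stage idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the repeated Gompertz function or the specific projection matrix, so the code assumes a single standard Gompertz curve `a * exp(-b * exp(-c * x))` as a stand-in for RG and a generic Gaussian random matrix with N(0, 1/k) entries as a stand-in for RP. All function names, parameters and the shared seed are hypothetical.

```python
import math
import random


def gompertz(x, a=1.0, b=1.0, c=1.0):
    """Standard Gompertz curve; an assumed stand-in for the paper's RG function."""
    return a * math.exp(-b * math.exp(-c * x))


def perturb(record, a=1.0, b=1.0, c=1.0):
    """Stage 1: pass every attribute of a record through the nonlinear function."""
    return [gompertz(x, a, b, c) for x in record]


def random_projection_matrix(k, d, rng):
    """Assumed generic k x d Gaussian matrix with i.i.d. N(0, 1/k) entries.

    With this scaling, squared Euclidean distances are preserved in
    expectation after projection from d to k dimensions.
    """
    sigma = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, sigma) for _ in range(d)] for _ in range(k)]


def project(matrix, record):
    """Stage 2: multiply the perturbed record by the projection matrix."""
    return [sum(m * x for m, x in zip(row, record)) for row in matrix]


def two_stage(records, k, seed=0):
    """Apply stage 1 then stage 2 to each record.

    Assumption: one matrix, derived from a seed, is used for all records.
    """
    rng = random.Random(seed)
    d = len(records[0])
    matrix = random_projection_matrix(k, d, rng)
    return [project(matrix, perturb(r)) for r in records]


if __name__ == "__main__":
    data = [[0.1] * 50, [0.9] * 50]
    reduced = two_stage(data, k=10, seed=1)
    print(len(reduced), len(reduced[0]))
```

The perturbed, projected records could then be passed to a clustering routine such as FCM in place of the raw data.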
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Lingjuan Lyu, James C. Bezdek, Yee Wei Law, Xuanli He, Marimuthu Palaniswami