Article code: 4950389
Journal code: 1440640
Publication year: 2017
English article: 16-page PDF
Full-text version: Free download
English title of the ISI article
Privacy and utility preserving data clustering for data anonymization and distribution on Hadoop
Related topics
Engineering and Basic Sciences > Computer Engineering > Computational Theory and Mathematics
English abstract
Data privacy is a stringent requirement when sharing and processing data in a distributed environment or in the Internet of Things. Collaborative privacy-preserving data mining based on secure multiparty computation incurs high communication and computational costs. Data anonymization is a promising privacy-preserving data mining technique used to protect data against identity disclosure. Information loss and the common attacks possible on anonymized data are serious challenges for anonymization. Recently, data anonymization using data mining techniques has shown significant improvement in data utility. Still, the existing techniques lack effective handling of attacks. Hence, in this paper, an anonymization algorithm based on clustering and resilient to similarity attacks and probabilistic inference attacks is proposed. The anonymized data is distributed on the Hadoop Distributed File System. The method achieves a better trade-off between privacy and utility. In our work, data utility is measured in terms of accuracy and F-measure with respect to different classifiers. Experiments show that the accuracy, F-measure, and execution time of the classification algorithms on the privacy-preserved data sets formed by the proposed clustering algorithms are better than those obtained with the existing algorithms.
Publisher
Database: Elsevier - ScienceDirect
Journal: Future Generation Computer Systems - Volume 74, September 2017, Pages 393-408