Article code: 494132
Journal code: 860944
Publication year: 2007
English article: 13 pages, PDF
Full-text version: Free download
English title of the ISI article
TFRP: An efficient microaggregation algorithm for statistical disclosure control
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Networks and Communications
English abstract

Statistical disclosure control (SDC) has recently attracted much attention. SDC is an important part of data security concerned with protecting databases. Microaggregation is an SDC technique widely used to protect confidentiality in statistical databases released for public use. In microaggregation, similar records are clustered into groups of at least k records each, where k is a predefined security threshold, so that individual information is not disclosed. For a given k, an optimal multivariate microaggregation is one with the lowest information loss, and finding it is NP-hard. Existing fixed-size techniques obtain low information loss with O(n^2) or O(n^3/k) time complexity. To improve execution time and lower information loss, this study proposes the Two Fixed Reference Points (TFRP) method, a two-phase algorithm for microaggregation. In the first phase, TFRP employs precomputation and the median-of-medians technique to shorten its running time to O(n^2/k). In the second phase, to decrease information loss, TFRP generates variable-size groups by removing the less homogeneous groups. Experimental results show that the proposed method is significantly faster than the Diameter and Centroid methods. On several test datasets, TFRP also significantly reduces information loss, particularly on sparse datasets with a large k.
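
To make the terms in the abstract concrete, here is a minimal Python sketch of the general microaggregation setting: records are partitioned into groups of at least k, and the quality of the partition is scored by an information-loss ratio. This is not TFRP. The grouping heuristic (`naive_fixed_size_groups`) is a deliberately simple, hypothetical baseline, and the loss measure SSE/SST (within-group sum of squares over total sum of squares) is an assumption taken from common usage in the microaggregation literature, since the abstract does not define the measure it uses. All function names are illustrative.

```python
from typing import List, Sequence

Record = Sequence[float]


def centroid(records: Sequence[Record]) -> List[float]:
    """Component-wise mean of a non-empty collection of records."""
    n = len(records)
    dims = len(records[0])
    return [sum(r[d] for r in records) / n for d in range(dims)]


def sq_dist(a: Record, b: Record) -> float:
    """Squared Euclidean distance between two records."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def naive_fixed_size_groups(records: Sequence[Record], k: int) -> List[List[Record]]:
    """Hypothetical baseline, NOT the TFRP algorithm.

    Sorts records by distance to the global centroid and cuts the
    ordering into consecutive chunks of k. A short final chunk is
    merged into the previous one, so every group has at least k
    records (and at most 2k - 1).
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    c = centroid(records)
    ordered = sorted(records, key=lambda r: sq_dist(r, c))
    groups = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    return groups


def information_loss(groups: Sequence[Sequence[Record]]) -> float:
    """SSE / SST (assumed measure): 0 means no loss, 1 means all
    records are replaced by the single overall centroid."""
    all_records = [r for g in groups for r in g]
    overall = centroid(all_records)
    sst = sum(sq_dist(r, overall) for r in all_records)
    sse = sum(sq_dist(r, centroid(g)) for g in groups for r in g)
    return sse / sst if sst > 0 else 0.0


if __name__ == "__main__":
    data = [(1, 1), (1, 2), (2, 1), (10, 10), (10, 11), (11, 10), (5, 5)]
    groups = naive_fixed_size_groups(data, k=3)
    print([len(g) for g in groups], information_loss(groups))
```

In a released dataset each record would be replaced by the centroid of its group, so a lower ratio means the published data stays closer to the original. Under this measure, allowing groups larger than k (as the abstract describes for the second phase of TFRP) can only help when merging or reassigning records makes the resulting groups more homogeneous.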

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Systems and Software - Volume 80, Issue 11, November 2007, Pages 1866–1878
Authors
, , ,