Hybrid microdata using microaggregation

Article code	Journal code	Publication year	English article	Full text
394178	665782	2010	11-page PDF	Free download

Keywords

Microaggregation; Privacy-preserving data mining; Hybrid data; Synthetic data; Statistical disclosure control

Related topics

Engineering and Basic Sciences > Computer Engineering > Artificial Intelligence

Abstract

Statistical disclosure control (also known as privacy-preserving data mining) of microdata is about releasing data sets containing the answers of individual respondents protected in such a way that: (i) the respondents corresponding to the released records cannot be re-identified; (ii) the released data stay analytically useful. Usually, the protected data set is generated by either masking (i.e. perturbing) the original data or by generating synthetic (i.e. simulated) data preserving some pre-selected statistics of the original data. Masked data may approximately preserve a broad range of distributional characteristics, although very few of them (if any) are exactly preserved; on the other hand, synthetic data exactly preserve the pre-selected statistics and may seem less disclosive than masked data, but they do not preserve at all any statistics other than those pre-selected. Hybrid data obtained by mixing the original data and synthetic data have been proposed in the literature to combine the strengths of masked and synthetic data. We show how to easily obtain hybrid data by combining microaggregation with any synthetic data generator. We show that numerical hybrid data exactly preserving means and covariances of original data and approximately preserving other statistics as well as some subdomain analyses can be obtained as a particular case with a very simple parameterization. The new method is competitive versus both the literature on hybrid data and plain multivariate microaggregation.
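The idea in the abstract can be sketched for a single numerical attribute. This is a simplified illustration under stated assumptions, not the authors' algorithm: `partition_sorted` is a hypothetical stand-in for a microaggregation heuristic (sort the values and cut them into groups of at least k records), and `synthetic_like` is a hypothetical generator that rescales random draws so that each cluster keeps its mean and population variance. A multivariate version would presumably match per-cluster covariance matrices as well; that is not shown here.

```python
import random
import statistics


def partition_sorted(values, k):
    """Hypothetical stand-in for microaggregation: sort the records and cut
    them into groups of k; the last group absorbs any remainder, so every
    group has at least k records (when len(values) >= k)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_groups = max(1, len(values) // k)
    groups = [order[g * k:(g + 1) * k] for g in range(n_groups)]
    groups[-1].extend(order[n_groups * k:])
    return groups


def synthetic_like(cluster, rng):
    """Hypothetical generator: draw random values, then shift and rescale
    them so their mean and population variance equal the cluster's."""
    mean = statistics.fmean(cluster)
    sd = statistics.pstdev(cluster)
    draw = [rng.gauss(0.0, 1.0) for _ in cluster]
    d_mean = statistics.fmean(draw)
    d_sd = statistics.pstdev(draw)
    if d_sd == 0.0:  # single-record cluster: nothing to rescale
        return [mean] * len(cluster)
    return [mean + sd * (x - d_mean) / d_sd for x in draw]


def hybrid(values, k, seed=0):
    """Replace each cluster of original values by synthetic values that
    share the cluster's mean and variance."""
    rng = random.Random(seed)
    out = [0.0] * len(values)
    for idx in partition_sorted(values, k):
        synth = synthetic_like([values[i] for i in idx], rng)
        for i, s in zip(idx, synth):
            out[i] = s
    return out


original = [21.0, 35.5, 19.2, 48.0, 33.3, 27.8, 51.1, 40.4, 23.9, 30.0]
released = hybrid(original, k=3)
```

Because every cluster keeps its mean and variance, the overall mean and variance of `released` match those of `original` up to floating-point error. With k = 1 each cluster is a single record, so the sketch returns the original values unchanged; larger k makes the output depend less on individual records.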

Publisher

Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volume 180, Issue 15, 1 August 2010, Pages 2834–2844

Authors

Josep Domingo-Ferrer, Úrsula González-Nicolás
