Article code | Journal code | Publication year | English article | Full-text version
518227 | 867566 | 2013 | 10-page PDF | Free download
English title of the ISI article
A semantic framework to protect the privacy of electronic health records with non-numerical attributes
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Applications
English abstract

Structured patient data such as Electronic Health Records (EHRs) are a valuable source for clinical research. However, the sensitive nature of this information requires an anonymisation procedure to be applied before the data are released to third parties. Several studies have shown that removing identifying attributes, such as the Social Security Number, is not enough to obtain an anonymous data file, since unique combinations of other attributes, for example rare diagnoses and personalised treatments, may lead to disclosure of a patient's identity. To tackle this problem, Statistical Disclosure Control (SDC) methods have been proposed to mask sensitive attributes while preserving, to a certain degree, the utility of the anonymised data. Most of these methods focus on continuous-scale numerical data. Because part of the clinical data found in EHRs is expressed with non-numerical attributes (for example diagnoses, symptoms and procedures), applying these methods to EHRs produces far from optimal results. In this paper, we propose a general framework that enables the accurate application of SDC methods to non-numerical clinical data, with a focus on the preservation of semantics. To do so, we exploit structured medical knowledge bases such as SNOMED CT to propose semantically grounded operators to compare, aggregate and sort non-numerical terms. Our framework has been applied to several well-known SDC methods and evaluated using a real clinical dataset with non-numerical attributes. Results show that exploiting medical semantics produces anonymised datasets that better preserve the utility of EHRs.
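
To illustrate why removing direct identifiers is not enough, the following minimal Python sketch flags records whose combination of remaining attributes is unique. The records, the attribute names and the function unique_combinations are hypothetical; none of them come from the paper or its dataset.

```python
from collections import Counter

# Hypothetical toy records with direct identifiers already removed.
records = [
    {"age_band": "30-39", "diagnosis": "asthma", "treatment": "inhaled corticosteroid"},
    {"age_band": "30-39", "diagnosis": "asthma", "treatment": "inhaled corticosteroid"},
    {"age_band": "60-69", "diagnosis": "Fabry disease", "treatment": "enzyme replacement"},
    {"age_band": "40-49", "diagnosis": "asthma", "treatment": "inhaled corticosteroid"},
]


def unique_combinations(rows, attributes):
    """Return the rows whose combination of the given attributes occurs exactly once."""
    keys = [tuple(row[a] for a in attributes) for row in rows]
    counts = Counter(keys)
    return [row for row, key in zip(rows, keys) if counts[key] == 1]


# Records that stand out on all three attributes and could be re-identified.
at_risk = unique_combinations(records, ["age_band", "diagnosis", "treatment"])
```

In this toy data the rare diagnosis stays unique even when the age band is ignored, which is the kind of record that masking methods need to alter.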

Highlights
► A semantic framework is used to guarantee privacy in the non-numerical data of EHRs.
► Ontology-based operators to compare, aggregate and sort medical terms are presented.
► Three statistical disclosure control methods are adapted to use this semantic framework.
► Evaluation is done using a real clinical dataset, with SNOMED CT as the ontology.
► A large improvement in data utility is reported over a non-semantic approach.
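
The three operators named in the highlights (compare, aggregate, sort) can be sketched on a tiny taxonomy. This is only an illustration under stated assumptions: the taxonomy below is a hypothetical toy, not SNOMED CT content, and plain edge counting stands in for whatever semantic measure the paper actually defines, which this page does not state. All function names are hypothetical.

```python
# Hypothetical toy taxonomy (child -> parent); NOT actual SNOMED CT content.
PARENT = {
    "disease": None,
    "respiratory disease": "disease",
    "asthma": "respiratory disease",
    "bronchitis": "respiratory disease",
    "cardiac disease": "disease",
    "angina": "cardiac disease",
}


def ancestors(term):
    """Return the path from a term up to the root, the term itself included."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return path


def distance(a, b):
    """Compare: count the edges between a and b through their lowest common ancestor."""
    path_a, path_b = ancestors(a), ancestors(b)
    common = next(t for t in path_a if t in path_b)
    return path_a.index(common) + path_b.index(common)


def centroid(terms):
    """Aggregate: pick the taxonomy concept with the smallest summed distance to all terms."""
    # sorted() makes tie-breaking deterministic (alphabetical order).
    return min(sorted(PARENT), key=lambda c: sum(distance(c, t) for t in terms))


def sort_by_reference(terms, reference):
    """Sort: order terms by increasing distance from a reference concept."""
    return sorted(terms, key=lambda t: (distance(reference, t), t))


diagnoses = ["angina", "bronchitis", "asthma"]
representative = centroid(diagnoses)
ordered = sort_by_reference(diagnoses, "asthma")
```

With operators of this kind, a masking method that normally averages or ranks numbers can instead replace a group of terms with a semantically central concept, or order terms by closeness of meaning.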

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Biomedical Informatics - Volume 46, Issue 2, April 2013, Pages 294–303
Authors
, , ,