Developing a standard for de-identifying electronic patient records written in Swedish: Precision, recall and F-measure in a manual and computerized annotation trial

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
516341	1449176	2009	8 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

Medical informatics applications - برنامه های کاربردی پزشکی medical record systems - مدارک پزشکی Natural Language Processing - پردازش زبان‌های طبیعی

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر نرم افزارهای علوم کامپیوتر

پیش نمایش صفحه اول مقاله

Developing a standard for de-identifying electronic patient records written in Swedish: Precision, recall and F-measure in a manual and computerized annotation trial

چکیده انگلیسی

BackgroundElectronic patient records (EPRs) contain a large amount of information written in free text. This information is considered very valuable for research but is also very sensitive since the free text parts may contain information that could reveal the identity of a patient. Therefore, methods for de-identifying EPRs are needed. The work presented here aims to perform a manual and automatic Protected Health Information (PHI)-annotation trial for EPRs written in Swedish.MethodsThis study consists of two main parts: the initial creation of a manually PHI-annotated gold standard, and the porting and evaluation of an existing de-identification software written for American English to Swedish in a preliminary automatic de-identification trial. Results are measured with precision, recall and F-measure.ResultsThis study reports fairly high Inter-Annotator Agreement (IAA) results on the manually created gold standard, especially for specific tags such as names. The average IAA over all tags was 0.65 F-measure (0.84 F-measure highest pairwise agreement). For name tags the average IAA was 0.80 F-measure (0.91 F-measure highest pairwise agreement). Porting a de-identification software written for American English to Swedish directly was unfortunately non-trivial, yielding poor results.ConclusionDeveloping gold standard sets as well as automatic systems for de-identification tasks in Swedish is feasible. However, discussions and definitions on identifiable information is needed, as well as further developments both on the tag sets and the annotation guidelines, in order to get a reliable gold standard. A completely new de-identification software needs to be developed.

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: International Journal of Medical Informatics - Volume 78, Issue 12, December 2009, Pages e19–e26

نویسندگان

Sumithra Velupillai, Hercules Dalianis, Martin Hassel, Gunnar H. Nilsson,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

Developing a standard for de-identifying electronic patient records written in Swedish: Precision, recall and F-measure in a manual and computerized annotation trial

دسترسی سریع

ارتباط

English Website