Article code | Journal code | Publication year | English article | Full-text version
6926958 | 1449119 | 2014 | 51-page PDF | Free download
English title of the ISI article
Text mining of cancer-related information: Review of current status and future directions
Persian translation of the title
استخراج متن اطلاعات مربوط به سرطان: مرور وضعیت فعلی و مسیرهای آینده
Keywords
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Software
English abstract
A range of studies have proven the feasibility of text mining (TM) for extracting structured information from clinical narratives such as those found in pathology or radiology reports. In this article, we provide a critical overview of the current state of the art for TM related to cancer. The review highlighted a strong bias towards symbolic methods, e.g. named entity recognition (NER) based on dictionary lookup and information extraction (IE) relying on pattern matching. The F-measure of NER ranges between 80% and 90%, while that of IE for simple tasks is in the high 90s. To further improve the performance, TM approaches need to deal effectively with idiosyncrasies of the clinical sublanguage such as non-standard abbreviations as well as a high degree of spelling and grammatical errors. This requires a shift from rule-based methods to machine learning following the success of similar trends in biological applications of TM. Machine learning approaches require large training datasets, but clinical narratives are not readily available for TM research due to privacy and confidentiality concerns. This issue remains the main bottleneck for progress in this area. In addition, there is a need for a comprehensive cancer ontology that would enable semantic representation of textual information found in narrative reports.
Publisher
Database: Elsevier - ScienceDirect
Journal: International Journal of Medical Informatics - Volume 83, Issue 9, September 2014, Pages 605-623