کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
11028870 | 1646701 | 2019 | 47 صفحه PDF | دانلود رایگان |
عنوان انگلیسی مقاله ISI
Clinical text classification research trends: Systematic literature review and open issues
ترجمه فارسی عنوان
روند تحقیقات طبقه بندی متن بالینی: بررسی ادبیات سیستماتیک و مسائل باز
دانلود مقاله + سفارش ترجمه
دانلود مقاله ISI انگلیسی
رایگان برای ایرانیان
کلمات کلیدی
طبقه بندی متن بالینی، مهندسی ویژگی، نظارت بر یادگیری ماشین، طبقه بندی متن مبتنی بر قانون، معیارهای عملکرد،
موضوعات مرتبط
مهندسی و علوم پایه
مهندسی کامپیوتر
هوش مصنوعی
چکیده انگلیسی
The pervasive use of electronic health databases has increased the accessibility of free-text clinical reports for supplementary use. Several text classification approaches, such as supervised machine learning (SML) or rule-based approaches, have been utilized to obtain beneficial information from free-text clinical reports. In recent years, many researchers have worked in the clinical text classification field and published their results in academic journals. However, to the best of our knowledge, no comprehensive systematic literature review (SLR) has recapitulated the existing primary studies on clinical text classification in the last five years. Thus, the current study aims to present SLR of academic articles on clinical text classification published from January 2013 to January 2018. Accordingly, we intend to maximize the procedural decision analysis in six aspects, namely, types of clinical reports, data sets and their characteristics, pre-processing and sampling techniques, feature engineering, machine learning algorithms, and performance metrics. To achieve our objective, 72 primary studies from 8 bibliographic databases were systematically selected and rigorously reviewed from the perspective of the six aspects. This review identified nine types of clinical reports, four types of data sets (i.e., homogeneous-homogenous, homogenous-heterogeneous, heterogeneous-homogenous, and heterogeneous-heterogeneous), two sampling techniques (i.e., over-sampling and under-sampling), and nine pre-processing techniques. Moreover, this review determined bag of words, bag of phrases, and bag of concepts features when represented by either term frequency or term frequency with inverse document frequency, thereby showing improved classification results. SML-based or rule-based approaches were generally employed to classify the clinical reports. To measure the performance of these classification approaches, we used precision, recall, F-measure, accuracy, AUC, and specificity in binary class problems. In multi-class problems, we primarily used micro or macro-averaging precision, recall, or F-measure. Lastly, open research issues and challenges are presented for future scholars who are interested in clinical text classification. This SLR will definitely be a beneficial resource for researchers engaged in clinical text classification.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Expert Systems with Applications - Volume 116, February 2019, Pages 494-520
Journal: Expert Systems with Applications - Volume 116, February 2019, Pages 494-520
نویسندگان
Ghulam Mujtaba, Liyana Shuib, Norisma Idris, Wai Lam Hoo, Ram Gopal Raj, Kamran Khowaja, Khairunisa Shaikh, Henry Friday Nweke,