Stacked ensemble coupled with feature selection for biomedical entity extraction

Article code	Journal code	Publication year	English article	Full-text version
405188	677504	2013	11-page PDF	Free download

Keywords

Conditional Random Field (CRF); Support Vector Machine (SVM)

Related topics

Engineering and Basic Sciences; Computer Engineering; Artificial Intelligence

Abstract

Entity extraction is one of the most fundamental tasks in biomedical information extraction. In this paper we propose a two-stage algorithm for extracting biomedical entities, in the form of gene and gene-product mentions, from text. Several approaches have emerged, but most state-of-the-art work suggests that an individual system cannot cover entity representations with an arbitrary set of features and therefore cannot achieve the best performance. We identify and implement a diverse set of features that are relevant for identifying biomedical entities and classifying them into predefined categories. An important criterion is that these features are identified and selected largely without using any domain knowledge. In the first stage, we use a genetic algorithm (GA) based feature selection technique to determine the most relevant set of features for Support Vector Machine (SVM) and Conditional Random Field (CRF) classifiers. The GA-based feature selection algorithm produces a best population that can be used to generate different classification models based on CRF and SVM. In the second stage, we develop a stacking-based ensemble to combine the classifiers selected in the first stage. The proposed approach is evaluated on two benchmark datasets, the JNLPBA 2004 shared task and GENETAG, and yields overall F-measure values of 75.17% and 94.70%, respectively.
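
The first stage described in the abstract can be illustrated with a minimal sketch of GA-based feature selection. This is not the authors' implementation: the toy data, the nearest-centroid classifier (a stand-in for the CRF and SVM models), the held-out accuracy used as fitness, and all names and parameter values (`make_toy`, `fitness`, `select_features`, `pop_size`, `p_mut`) are assumptions made purely for illustration.

```python
import random

def make_toy(n, rng):
    """Hypothetical data: features 0-1 are informative, features 2-4 are noise."""
    data = []
    for _ in range(n):
        label = rng.choice(["GENE", "O"])
        shift = 2.0 if label == "GENE" else 0.0
        row = [rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)]
        row += [rng.gauss(0.0, 1.0) for _ in range(3)]
        data.append((row, label))
    return data

def fitness(mask, train, dev):
    """Held-out accuracy of a nearest-centroid classifier on the masked features.
    (Assumption: a stand-in for the fitness of a CRF or SVM model.)"""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature set cannot classify anything
    sums, counts = {}, {}
    for row, label in train:
        acc = sums.setdefault(label, [0.0] * len(cols))
        for k, j in enumerate(cols):
            acc[k] += row[j]
        counts[label] = counts.get(label, 0) + 1
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def predict(row):
        return min(
            centroids,
            key=lambda c: sum((row[j] - centroids[c][k]) ** 2
                              for k, j in enumerate(cols)),
        )

    return sum(predict(row) == label for row, label in dev) / len(dev)

def select_features(train, dev, n_features, pop_size=12, generations=15,
                    p_mut=0.1, seed=0):
    """Evolve boolean feature masks; return the final population, best first."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(m, train, dev), reverse=True)
        elite = ranked[: pop_size // 2]            # keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)     # single-point crossover
            child = a[:cut] + b[cut:]
            # bit-flip mutation
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = elite + children
    return sorted(pop, key=lambda m: fitness(m, train, dev), reverse=True)

rng = random.Random(42)
train, dev = make_toy(80, rng), make_toy(40, rng)
final_population = select_features(train, dev, n_features=5)
best_mask = final_population[0]
print("best mask:", best_mask, "fitness:", fitness(best_mask, train, dev))
```

Each mask in `final_population` defines one candidate feature subset. In the second stage described above, models trained on several such subsets would be combined by a stacked ensemble; that step is not shown here.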

Publisher

Database: Elsevier - ScienceDirect
Journal: Knowledge-Based Systems - Volume 46, July 2013, Pages 22–32

Authors

Asif Ekbal, Sriparna Saha
