Article code: 379085
Journal code: 659261
Publication year: 2010
English article: 16 pages, PDF
Full-text version: free download
English title of the ISI article
Sentence identification of biological interactions using PATRICIA tree generated patterns and genetic algorithm optimized parameters
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

An important task in information retrieval is to identify sentences that contain important relationships between key concepts. In this work, we propose a novel approach to automatically extract sentence patterns that contain interactions involving concepts of molecular biology. A pattern is defined in this work as a sequence of specialized Part-of-Speech (POS) tags that capture the structure of key sentences in the scientific literature. Each candidate sentence for the classification task is encoded as a POS array and then aligned to a collection of pre-extracted patterns. The quality of the alignment is expressed as a pairwise alignment score. The most innovative component of this work is the use of a genetic algorithm (GA) to maximize the classification performance of the alignment scoring scheme. The system achieves an average F-score of 0.796 in identifying sentences which describe interactions between co-occurring biological concepts. This performance is mostly affected by the quality of the preprocessing steps such as term identification and POS tagging.
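The alignment step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the actual scoring scheme, so this example assumes a standard Needleman-Wunsch global alignment over POS-tag sequences; the tag names (`PROTEIN`, `VB`, `NN`), the function names, and the `match`, `mismatch`, and `gap` parameters are all hypothetical. In a setup like the one described, such parameters are the kind of values a genetic algorithm could tune against the F-score.

```python
from typing import Iterable, Sequence


def alignment_score(sentence_tags: Sequence[str],
                    pattern_tags: Sequence[str],
                    match: float = 1.0,
                    mismatch: float = -1.0,
                    gap: float = -1.0) -> float:
    """Global alignment score between two POS-tag sequences.

    Assumption: plain Needleman-Wunsch scoring, not the paper's scheme.
    """
    n, m = len(sentence_tags), len(pattern_tags)
    # prev[j]: best score aligning the sentence prefix seen so far
    # with the first j pattern tags.
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            same = sentence_tags[i - 1] == pattern_tags[j - 1]
            substitution = prev[j - 1] + (match if same else mismatch)
            skip_sentence_tag = prev[j] + gap
            skip_pattern_tag = curr[j - 1] + gap
            curr[j] = max(substitution, skip_sentence_tag, skip_pattern_tag)
        prev = curr
    return prev[m]


def best_pattern_score(sentence_tags: Sequence[str],
                       patterns: Iterable[Sequence[str]],
                       **params: float) -> float:
    """Highest alignment score of a sentence against any stored pattern."""
    return max(alignment_score(sentence_tags, p, **params) for p in patterns)


def f_score(tp: int, fp: int, fn: int) -> float:
    """Balanced F-score (harmonic mean of precision and recall)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical tag sequences, for illustration only.
patterns = [["PROTEIN", "VB", "PROTEIN"], ["NN", "VB", "NN"]]
sentence = ["PROTEIN", "VB", "NN", "PROTEIN"]
score = best_pattern_score(sentence, patterns)
```

A sentence would then be classified as describing an interaction when its best score exceeds some threshold; that threshold is another value one might leave to the optimizer rather than set by hand.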

Publisher
Database: Elsevier - ScienceDirect
Journal: Data & Knowledge Engineering - Volume 69, Issue 1, January 2010, Pages 137–152