کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
377878 658844 2011 11 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Multiple kernel learning in protein–protein interaction extraction from biomedical literature
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
پیش نمایش صفحه اول مقاله
Multiple kernel learning in protein–protein interaction extraction from biomedical literature
چکیده انگلیسی

ObjectiveKnowledge about protein–protein interactions (PPIs) unveils the molecular mechanisms of biological processes. The volume and content of published biomedical literature on protein interactions is expanding rapidly, making it increasingly difficult for interaction database administrators, responsible for content input and maintenance to detect and manually update protein interaction information. The objective of this work is to develop an effective approach to automatic extraction of PPI information from biomedical literature.Methods and materialsWe present a weighted multiple kernel learning-based approach for automatic PPI extraction from biomedical literature. The approach combines the following kernels: feature-based, tree, graph and part-of-speech (POS) path. In particular, we extend the shortest path-enclosed tree (SPT) and dependency path tree to capture richer contextual information.ResultsOur experimental results show that the combination of SPT and dependency path tree extensions contributes to the improvement of performance by almost 0.7 percentage units in F-score and 2 percentage units in area under the receiver operating characteristics curve (AUC). Combining two or more appropriately weighed individual will further improve the performance. Both on the individual corpus and cross-corpus evaluation our combined kernel can achieve state-of-the-art performance with respect to comparable evaluations, with 64.41% F-score and 88.46% AUC on the AImed corpus.ConclusionsAs different kernels calculate the similarity between two sentences from different aspects. Our combined kernel can reduce the risk of missing important features. More specifically, we use a weighted linear combination of individual kernels instead of assigning the same weight to each individual kernel, thus allowing the introduction of each kernel to incrementally contribute to the performance improvement. In addition, SPT and dependency path tree extensions can improve the performance by including richer context information.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Artificial Intelligence in Medicine - Volume 51, Issue 3, March 2011, Pages 163–173
نویسندگان
, , , , , ,