Protein–protein interaction extraction by leveraging multiple kernels and parsers

Article ID	Journal	Published Year	Pages	File Type
516344	International Journal of Medical Informatics	2009	8 Pages	PDF

Abstract

Protein–protein interaction (PPI) extraction is an important and widely researched task in the biomedical natural language processing (BioNLP) field. Kernel-based machine learning methods have been widely used to extract PPIs automatically, and several kernels focusing on different parts of sentence structure have been published for the PPI task. In this paper, we propose a method that combines kernels based on several syntactic parsers in order to retrieve the widest possible range of important information from a given sentence. We evaluate the method using a support vector machine (SVM) and achieve better results than other state-of-the-art PPI systems on four out of five corpora. Further, we analyze the compatibility of the five corpora from the viewpoint of PPI extraction and find that some of them have small incompatibilities but can still be combined with little effort.
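
The abstract does not state how the kernels are combined. As a minimal sketch of one common option, the example below assumes each kernel has already produced a Gram matrix over the same candidate protein pairs, cosine-normalizes each matrix so that no single kernel dominates, and sums them element-wise. The kernel names (`k_bow`, `k_graph`) and the toy values are hypothetical and are not taken from the paper.

```python
import math


def normalize(gram):
    """Cosine-normalize a Gram matrix: K'[i][j] = K[i][j] / sqrt(K[i][i] * K[j][j])."""
    n = len(gram)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            denom = math.sqrt(gram[i][i] * gram[j][j])
            # Guard against all-zero examples, which have no defined direction.
            out[i][j] = gram[i][j] / denom if denom > 0 else 0.0
    return out


def combine(grams):
    """Sum several normalized Gram matrices element-wise (assumed combination rule)."""
    normalized = [normalize(g) for g in grams]
    n = len(normalized[0])
    return [[sum(g[i][j] for g in normalized) for j in range(n)] for i in range(n)]


# Hypothetical Gram matrices for two candidate pairs, one per kernel.
k_bow = [[4.0, 2.0], [2.0, 9.0]]    # e.g. a bag-of-words kernel
k_graph = [[1.0, 0.5], [0.5, 1.0]]  # e.g. a kernel over parser output

combined = combine([k_bow, k_graph])
print(combined)
```

A sum of valid kernels is itself a valid kernel, so the combined matrix could be passed to any SVM implementation that accepts a precomputed kernel.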

Keywords

Relation extraction; Support vector machine; Machine learning