Using a shallow linguistic kernel for drug–drug interaction extraction

Article ID	Journal	Published Year	Pages	File Type
518258	Journal of Biomedical Informatics	2011	16 Pages	PDF

Abstract

A drug–drug interaction (DDI) occurs when one drug influences the level or activity of another drug. Information Extraction (IE) techniques can help health care professionals reduce the time they spend reviewing the literature for potential drug–drug interactions. Nevertheless, no approach has been proposed to the problem of extracting DDIs from biomedical texts. In this article, we study whether a machine learning-based method is appropriate for DDI extraction from biomedical texts and whether its results are superior to those obtained with our previously proposed pattern-based approach [1]. The method proposed here for DDI extraction is based on a supervised machine learning technique, more specifically, the shallow linguistic kernel proposed in Giuliano et al. (2006) [2]. Since no benchmark corpus was available to evaluate our approach to DDI extraction, we created the first such corpus, DrugDDI, annotated with 3169 DDIs. We performed several experiments varying the configuration parameters of the shallow linguistic kernel. The model that maximizes the F-measure was evaluated on the test data of the DrugDDI corpus, achieving a precision of 51.03%, a recall of 72.82% and an F-measure of 60.01%. To the best of our knowledge, this work proposes the first full solution for the automatic extraction of DDIs from biomedical texts. Our study confirms that the shallow linguistic kernel outperforms our previous pattern-based approach. Additionally, it is our hope that the DrugDDI corpus will allow researchers to explore new solutions to the DDI extraction problem.
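The reported F-measure can be checked against the reported precision and recall. The sketch below assumes the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not state which weighting was used, and the function name `f_measure` and its `beta` parameter are illustrative, not taken from the article.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    beta = 1.0 gives the balanced F1 score (an assumption here; the
    abstract only says "F-measure").
    """
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values reported in the abstract for the DrugDDI test set (percentages).
reported_precision = 51.03
reported_recall = 72.82

f1 = f_measure(reported_precision, reported_recall)
print(f"F-measure: {f1:.2f}%")  # prints "F-measure: 60.01%"
```

Under this assumption the computed value agrees with the 60.01% given in the abstract, which suggests the reported figure is the balanced F1.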

Graphical abstract

Our goal is to develop an IE system to extract drug–drug interactions from biomedical texts. We use the DrugBank database as the source of unstructured textual information on drugs and their interactions. These texts are analyzed by the MetaMap tool, which provides shallow syntactic and semantic information. Our system is based on a supervised machine learning approach, in particular, a shallow linguistic kernel-based approach that uses Support Vector Machines (SVM).

Highlights

► We propose the first full solution for the automatic extraction of drug–drug interactions (DDIs) from biomedical texts.
► We create the first annotated corpus with DDIs in order to train and evaluate our system.
► Our system is based on a shallow linguistic kernel.

Keywords

Patient safety; Drug–drug interactions; Unified Medical Language System; Machine learning