کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
377790 658829 2011 8 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Terminological resources for text mining over biomedical scientific literature
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
پیش نمایش صفحه اول مقاله
Terminological resources for text mining over biomedical scientific literature
چکیده انگلیسی

SummaryObjectiveWe present a combined terminological resource for text mining over biomedical literature. The purpose of the resource is to allow the detection of mentions of specific biological entities in scientific publications, and their grounding to widely accepted identifiers. This is an essential process, useful in itself, and necessary as an intermediate step for almost every type of complex text mining application.MethodsWe discuss some of the properties of the terminology for this domain, in particular the degree of ambiguity, which constitutes a peculiar problem for text mining applications. Without a correct recognition and disambiguation of the domain entities no reliable results can be produced.ResultsWe also discuss an application that makes use of the resulting terminological knowledge base. We annotate an existing corpus of sentences about protein interactions. The annotation consists of a normalization step that matches the terms in our resource with their actual representation in the corpus, and a disambiguation step that resolves the ambiguity of matched terms.ConclusionIn this paper we present a large terminological resource, compiled through the aggregation of a number of different manually curated sources. We discuss the lexical properties of such resources, specifically the degree of ambiguity of the terms, and we inspect the causes of such ambiguity, in particular for protein names. This information is of vital importance for the implementation of an efficient term normalization and grounding algorithm.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Artificial Intelligence in Medicine - Volume 52, Issue 2, June 2011, Pages 107–114
نویسندگان
, , ,