Article code: 517356
Journal code: 1449208
Publication year: 2007
English article: 6 pages, PDF
Full-text version: free download
English title of the ISI article
Automatic lexeme acquisition for a multilingual medical subword thesaurus
Related topics
Engineering and Basic Sciences; Computer Engineering; Computer Science Software
English abstract

Purpose: We present a method for the automated acquisition of a multilingual medical lexicon (for Spanish, French and Swedish) to be used within the framework of a medical cross-language text retrieval system.

Methods: For the lexical acquisition process, we incorporate seed lexicons and lists of trusted term translations derived from the UMLS Metathesaurus. The seed lexicons for Spanish, French and Swedish are automatically generated from (previously manually constructed) Portuguese, German and English sources by simple string transformations. Lexical and semantic hypotheses are then validated by processing pairs of term translations. In a final step, we use the cleaned list of “approved” translations to augment the target dictionaries step by step: the parallel corpora are processed for co-occurrence patterns of hypothesized translation equivalents that cannot be derived by simple character substitutions.

Results: An existing multilingual lexicon for the medical domain with about 60,000 entries for English, German, and Portuguese was automatically augmented by more than 17,000 new lexemes for Spanish, French, and Swedish.

Conclusions: Our approach constitutes a promising method for the automated creation of new lexicon entries and their linkage to semantic identifiers.
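
The seed-lexicon step mentioned in the abstract (deriving target-language entries from an existing source lexicon by simple string transformations) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the three Portuguese-to-Spanish rewrite rules, the identifiers such as `ID_INFECTION`, the function names, and the idea of filtering candidates against a target-language word list.

```python
# Hypothetical rewrite rules (Portuguese -> Spanish). Illustrative only;
# this is not the rule set used by the authors.
PT_TO_ES_RULES = [
    ("ção", "ción"),
    ("nh", "ñ"),
    ("lh", "ll"),
]


def generate_candidates(source_entry, rules):
    """Return the forms reachable by applying any subset of the rules, in order."""
    candidates = {source_entry}
    for old, new in rules:
        # Keep the unrewritten forms as well, so that every rule is optional.
        candidates |= {c.replace(old, new) for c in candidates}
    return candidates


def build_seed_lexicon(source_lexicon, rules, target_vocabulary):
    """Link each candidate attested in the target vocabulary to the
    semantic identifier of the source entry it was derived from."""
    seed = {}
    for entry, identifier in source_lexicon.items():
        for candidate in generate_candidates(entry, rules):
            if candidate in target_vocabulary:
                seed[candidate] = identifier
    return seed


# Toy data: a source lexicon with made-up identifiers and a small target word list.
source_lexicon = {
    "infecção": "ID_INFECTION",
    "banho": "ID_BATH",
    "coração": "ID_HEART",
}
target_vocabulary = {"infección", "baño", "corazón"}

seed_lexicon = build_seed_lexicon(source_lexicon, PT_TO_ES_RULES, target_vocabulary)
print(seed_lexicon)
```

In this toy run, "infección" and "baño" are recovered, while "corazón" is not, because no rule maps "coração" to it. Pairs of that kind are what the abstract describes as translation equivalents that cannot be derived by simple character substitutions and that are instead acquired from co-occurrence patterns in the parallel corpora.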

Publisher
Database: Elsevier - ScienceDirect
Journal: International Journal of Medical Informatics - Volume 76, Issues 2–3, February–March 2007, Pages 184–189