Article ID | Journal ID | Year of publication | English article | Full-text version |
---|---|---|---|---|
517356 | 1449208 | 2007 | 6-page PDF | Free download |
Purpose: We present a method for the automated acquisition of a multilingual medical lexicon (for Spanish, French and Swedish) to be used within a medical cross-language text retrieval system.

Methods: For the lexical acquisition process, we incorporate seed lexicons and lists of trusted term translations derived from the UMLS Metathesaurus. The seed lexicons for Spanish, French and Swedish are automatically generated from previously manually constructed Portuguese, German and English sources by simple string transformations. Lexical and semantic hypotheses are then validated by processing pairs of term translations. In a last step, we use the cleaned list of "approved" translations to augment the target dictionaries step by step: the parallel corpora are processed for co-occurrence patterns of hypothesized translation equivalents that cannot be derived by simple character substitutions.

Results: An existing multilingual lexicon for the medical domain with about 60,000 entries for English, German, and Portuguese was automatically augmented by more than 17,000 new lexemes for Spanish, French, and Swedish.

Conclusions: Our approach constitutes a promising method for the automated creation of new lexicon entries and their linkage to semantic identifiers.
Journal: International Journal of Medical Informatics - Volume 76, Issues 2–3, February–March 2007, Pages 184–189
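
The first two steps in the Methods (generating seed entries by simple string transformations, then validating them against trusted translations) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the Portuguese-to-Spanish rewrite rules, the semantic identifiers such as `#inflam`, and the function names `transform` and `build_seed_lexicon` are hypothetical and are not taken from the paper. Validation is reduced to a membership test against a set of trusted target-language terms, which is much simpler than the procedure the abstract describes.

```python
import re

# Hypothetical rewrite rules (Portuguese -> Spanish), for illustration only.
PT_TO_ES_RULES = [
    (r"ção$", "ción"),  # e.g. inflamação -> inflamación
    (r"nh", "ñ"),       # e.g. aranha -> araña
]


def transform(lexeme, rules):
    """Apply every rewrite rule in order and return the candidate string."""
    for pattern, replacement in rules:
        lexeme = re.sub(pattern, replacement, lexeme)
    return lexeme


def build_seed_lexicon(source_lexicon, rules, trusted_target_terms):
    """Keep only the candidates that occur among the trusted target terms.

    source_lexicon maps a source-language lexeme to a semantic identifier;
    the identifier is carried over to each accepted target-language candidate.
    """
    seed = {}
    for lexeme, semantic_id in source_lexicon.items():
        candidate = transform(lexeme, rules)
        if candidate in trusted_target_terms:
            seed[candidate] = semantic_id
    return seed


# Toy data: two Portuguese lexemes with made-up semantic identifiers.
portuguese = {"inflamação": "#inflam", "coração": "#heart"}
# Stand-in for target-language terms from a trusted translation list.
trusted_spanish = {"inflamación", "corazón"}

seed_lexicon = build_seed_lexicon(portuguese, PT_TO_ES_RULES, trusted_spanish)
print(seed_lexicon)
```

In this toy run, `coração` is rewritten to `coración`, which is not among the trusted terms, so it is dropped. Cases like this, where the correct equivalent (`corazón`) cannot be reached by character substitution, are the ones the abstract assigns to the final corpus-based co-occurrence step.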