Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
383506 | 660824 | 2015 | 10-page PDF | Free download |
• Language-independent Named Entity Recognition system.
• Novel features based on latent semantics.
• Experiments on multiple languages – English, Spanish, Dutch, Czech.
• State-of-the-art results.
In this paper, we propose new features for Named Entity Recognition (NER) based on latent semantics. Furthermore, we explore the effect of unsupervised morphological information on these features and on the NER system in general. The resulting NER system is fully language-independent thanks to the unsupervised nature of the proposed features. We evaluate the system on English, Spanish, Dutch, and Czech corpora and study the difference between weakly and highly inflectional languages. Our system matches or outperforms state-of-the-art language-dependent systems. The proposed features proved very useful and are the main reason for our promising results.
Journal: Expert Systems with Applications - Volume 42, Issue 7, 1 May 2015, Pages 3470–3479