Building Sense Tagged Corpus Using Wikipedia for Supervised Word Sense Disambiguation

Article code	Journal code	Publication year	English article	Full-text version
6900927	1446491	2018	10-page PDF	Free download

Related topics

Engineering and Basic Sciences, Computer Engineering, Computer Science (General)

Abstract

Building sense-tagged data is a major challenge for supervised techniques, which have achieved promising results in word sense disambiguation. Building sense-tagged data manually is a labor-intensive and time-consuming task because linguistic experts have to label each ambiguous word in the collected contexts. Therefore, this paper proposes a knowledge-based method for building an Arabic sense-tagged corpus from Wikipedia. The method starts by mapping Arabic WordNet to Wikipedia in order to select the Wikipedia article that corresponds to each WordNet sense. In this mapping step, a cross-lingual method is used to measure the similarity between the features of a Wikipedia article and those of a WordNet sense separately. Then, the incoming links of Wikipedia articles are exploited to extract instances for the sense of each ambiguous word in WordNet. To handle the lack of instances for some Wikipedia articles, a multiword-based technique is proposed to increase the number of instances for each concept. Experimental results show that the cross-lingual method outperforms a monolingual method based on Arabic features only. The sense-tagged corpus is created for 50 ambiguous words, yielding 148 senses with 30,961 instances.
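
The two main steps described in the abstract (mapping each sense to an article, then harvesting instances from incoming links) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy English data, the sense identifiers, and all function names are hypothetical, and a simple monolingual Jaccard word overlap stands in for the paper's cross-lingual similarity measure. The multiword-based expansion step is not shown.

```python
import re

# Hypothetical toy data, invented for illustration only.
SENSES = {
    "bank%finance": "financial institution that accepts deposits and lends money",
    "bank%river": "sloping land beside a river or lake",
}
ARTICLES = {
    "Bank": "A bank is a financial institution that accepts deposits and makes loans.",
    "Bank (geography)": "A bank is the land alongside a river, stream or lake.",
}
# Text of other pages, containing wiki-style links of the form [[Target|anchor]].
PAGES = [
    "She opened an account at the [[Bank|bank]] downtown. The weather was fine.",
    "They fished from the [[Bank (geography)|bank]] of the river.",
    "The [[Bank|bank]] raised its interest rates.",
]

LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


def tokens(text):
    """Lowercased set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Word-overlap similarity (an assumed stand-in for the paper's measure)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def map_senses(senses, articles):
    """Step 1: pick, for each sense gloss, the most similar article title."""
    return {
        sense_id: max(articles, key=lambda title: jaccard(gloss, articles[title]))
        for sense_id, gloss in senses.items()
    }


def extract_instances(pages, title):
    """Step 2: collect sentences that link to `title`, with link markup removed."""
    instances = []
    for page in pages:
        for sentence in re.split(r"(?<=[.!?])\s+", page):
            targets = [m.group(1) for m in LINK.finditer(sentence)]
            if title in targets:
                # Replace each link with its anchor text (or the target if no anchor).
                plain = LINK.sub(lambda m: m.group(2) or m.group(1), sentence)
                instances.append(plain)
    return instances


def build_corpus(senses, articles, pages):
    """Map senses to articles, then gather linked sentences as tagged instances."""
    mapping = map_senses(senses, articles)
    return {sid: extract_instances(pages, title) for sid, title in mapping.items()}


corpus = build_corpus(SENSES, ARTICLES, PAGES)
for sense_id, sentences in corpus.items():
    print(sense_id, sentences)
```

In this sketch, every sentence that links to the article chosen for a sense becomes a labeled instance of that sense, which is the role the abstract assigns to incoming links.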

Publisher

Database: Elsevier - ScienceDirect
Journal: Procedia Computer Science - Volume 123, 2018, Pages 403-412

Authors

Abdulgabbar Saif, Nazlia Omar, Ummi Zakiah Zainodin, Mohd Juziaddin Ab Aziz
