Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
4942169 | 1436992 | 2016 | 36-page PDF | Free download |
English title of the ISI article
Nasari: Integrating explicit knowledge and corpus statistics for a multilingual representation of concepts and entities
Persian translation of the title
نصاری: ادغام دانش صریح و آمار جامع برای نمایش چندزبانه مفاهیم و نهادها
Keywords
Semantic representation, Lexical semantics, Word Sense Disambiguation, Semantic similarity, Sense clustering, Domain labeling
Related topics
Engineering and basic sciences
Computer engineering
Artificial intelligence
English abstract
Owing to the need for a deep understanding of linguistic items, semantic representation is considered to be one of the fundamental components of several applications in Natural Language Processing and Artificial Intelligence. As a result, semantic representation has been one of the prominent research areas in lexical semantics over the past decades. However, due mainly to the lack of large sense-annotated corpora, most existing representation techniques are limited to the lexical level and thus cannot be effectively applied to individual word senses. In this paper we put forward a novel multilingual vector representation, called Nasari, which not only enables accurate representation of word senses in different languages, but it also provides two main advantages over existing approaches: (1) high coverage, including both concepts and named entities, (2) comparability across languages and linguistic levels (i.e., words, senses and concepts), thanks to the representation of linguistic items in a single unified semantic space and in a joint embedded space, respectively. Moreover, our representations are flexible, can be applied to multiple applications and are freely available at http://lcl.uniroma1.it/nasari/. As evaluation benchmark, we opted for four different tasks, namely, word similarity, sense clustering, domain labeling, and Word Sense Disambiguation, for each of which we report state-of-the-art performance on several standard datasets across different languages.
Publisher
Database: Elsevier - ScienceDirect
Journal: Artificial Intelligence - Volume 240, November 2016, Pages 36-64
Authors
José Camacho-Collados, Mohammad Taher Pilehvar, Roberto Navigli