Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
6856848 | 1437971 | 2018 | 12-page PDF | Free download |
English title of the ISI article
Entity mention aware document representation
Persian translation of the title
ذینفع نمایندگی سند را آشکار می کند
Keywords
Distributed representation, text clustering, text classification, entity linking
Related topics
Engineering and basic sciences
Computer engineering
Artificial intelligence
English abstract
Representing variable-length texts (e.g., sentences, documents) with low-dimensional continuous vectors has been a topic of recent interest due to its successful applications in various NLP tasks. During the learning process, most existing methods tend to treat all words equally, regardless of their possibly different intrinsic nature. We believe that for some types of documents (e.g., news articles), entity mentions are more informative than ordinary words, and that certain tasks can benefit if they are properly utilized. In this paper, we propose a novel approach for learning low-dimensional vector representations of documents. The learned representations capture information not only about the words in documents, but also about the entity mentions in documents and the connections between different entities. Experimental results demonstrate that our approach significantly improves text clustering and text classification performance and outperforms previous studies on the TAC-KBP entity linking benchmark.
Publisher
Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volumes 430–431, March 2018, Pages 216-227
Authors
Hongliang Dai, Siliang Tang, Fei Wu, Yueting Zhuang