Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
515070 | 866945 | 2009 | 15-page PDF | Free download |
English title of the ISI article
The bootstrapping of the Yarowsky algorithm in real corpora
Related topics
Engineering and Basic Sciences
Computer Engineering
Computer Science Software
English abstract
The Yarowsky bootstrapping algorithm resolves the homograph-level word sense disambiguation (WSD) problem, which is the sense granularity level required for real natural language processing (NLP) applications. At the same time it resolves the knowledge acquisition bottleneck problem affecting most WSD algorithms and can be easily applied to foreign language corpora. However, this paper shows that the Yarowsky algorithm is significantly less accurate when applied to domain fluctuating, real corpora. This paper also introduces a new bootstrapping methodology that performs much better when applied to these corpora. The accuracy achieved in non-domain fluctuating corpora is not reached due to inherent domain fluctuation ambiguities.
Publisher
Database: Elsevier - ScienceDirect
Journal: Information Processing & Management - Volume 45, Issue 1, January 2009, Pages 55–69
Authors
Ricardo Sánchez-de-Madariaga, José R. Fernández-del-Castillo