Learning a taxonomy from a set of text documents

Article code	Journal code	Publication year	English article	Full-text version
496718	862868	2012	11-page PDF	Free download

Keywords

Keyphrase extraction; Knowledge representation; Document clustering; Fuzzy logic; Self-organizing map; Multilinguality

Related topics

Engineering and Basic Sciences, Computer Engineering, Computer Science Software

Abstract

We present a methodology for learning a taxonomy from a set of text documents, each of which describes one concept. The taxonomy is obtained by clustering the concept definition documents with a hierarchical approach to the Self-Organizing Map. In this study, we compare three different feature extraction approaches with varying degrees of language independence. The feature extraction schemes include fuzzy logic-based feature weighting and selection, statistical keyphrase extraction, and the traditional tf-idf weighting scheme. The experiments are conducted for English, Finnish, and Spanish. The results show that while the rule-based fuzzy logic systems have an advantage in automatic taxonomy learning, taxonomies can also be constructed with tolerable results using statistical methods without domain- or style-specific knowledge.

► We learn a taxonomy from a set of encyclopedia articles.
► A hierarchical approach to the Self-Organizing Map is used to cluster the documents.
► Experiments are conducted for English, Finnish and Spanish.
► Rule-based systems have an advantage in automatic taxonomy learning.
► Taxonomies can also be constructed with good results using statistical methods.
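
As a rough illustration of the traditional tf-idf weighting scheme named in the abstract, the sketch below weights each term by its raw frequency in a document times log(N / df). The abstract does not state which tf-idf variant the authors use, so this formula, the function name tfidf, and the toy documents are all assumptions made for illustration only. The fuzzy logic weighting, keyphrase extraction, and Self-Organizing Map clustering steps are not shown.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: raw term frequency times log(N / df).
    The paper's exact weighting may differ.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw term counts within this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


# Hypothetical toy "concept definition" documents, already tokenized.
docs = [
    "a dog is a domestic mammal".split(),
    "a cat is a domestic mammal".split(),
    "an oak is a tree".split(),
]
weights = tfidf(docs)
```

In this toy set, a term that occurs in every document (such as "a") receives weight zero, while a term unique to one document (such as "dog") receives the highest weight, which is why tf-idf vectors can serve as language-independent features for clustering.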

Publisher

Database: Elsevier - ScienceDirect
Journal: Applied Soft Computing - Volume 12, Issue 3, March 2012, Pages 1138–1148

Authors

Mari-Sanna Paukkeri, Alberto Pérez García-Plaza, Víctor Fresno, Raquel Martínez Unanue, Timo Honkela
