Article code  Journal code  Publication year  English article  Full-text version
1114768  1488412  2014  6-page PDF  Free download
English title of the ISI article
A Methodology for Building Simple but Robust Stemmers without Language Knowledge: Stemmer Configuration
Related topics
Social Sciences and Humanities; Arts and Humanities; Arts and Humanities (General)
Abstract

This work is part of a project that aims to define a methodology for building simple but robust stemmers without knowledge of the stemmer's target language. The methodology starts with a very simple primary stemmer that is applied to a collection of words and returns the corresponding stems. The primary stemmer always removes the longest suffix that matches the ending of the examined word. Next, Information Retrieval (IR) experts express their arguments against the results of the primary stemmer. The methodology then allows the creation of a series of consecutive trial stemmers that conform increasingly to the arguments expressed by the IR experts. Here, we focus on the attributes and adjustable options available to the person responsible for building the consecutive trial stemmers and for creating the best trial, that is, the stemmer that respects the arguments against the primary stemmer as far as possible.
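
The longest-suffix rule of the primary stemmer can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `primary_stem`, the `SUFFIXES` list, and the guard that refuses to strip an entire word are all assumptions made for illustration, since the abstract does not say where the candidate suffixes come from.

```python
def primary_stem(word, suffixes):
    """Strip the longest suffix in `suffixes` that matches the end of `word`."""
    # Try longer suffixes first so that the longest match wins.
    for suffix in sorted(suffixes, key=len, reverse=True):
        # Assumed guard (not from the abstract): never strip the whole word.
        if suffix and len(suffix) < len(word) and word.endswith(suffix):
            return word[: -len(suffix)]
    # No suffix matched: the word is its own stem.
    return word


# Hypothetical suffix list, for illustration only.
SUFFIXES = ["s", "es", "ing", "ings"]

stems = {w: primary_stem(w, SUFFIXES) for w in ["meetings", "boxes", "sing", "cat"]}
print(stems)
```

With this toy list, "sing" is reduced to "s". An over-aggressive result of this kind is the sort of outcome an IR expert might argue against, which is what would motivate a more restrictive trial stemmer.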

Publisher
Database: Elsevier - ScienceDirect
Journal: Procedia - Social and Behavioral Sciences - Volume 147, 25 August 2014, Pages 370-375