Article ID | Journal ID | Publication year | English article | Full text
508854 | 865455 | 2016 | 10-page PDF | Free download
Title of ISI article
Text classification based filters for a domain-specific search engine
Keywords
Search engines, text classification, annotation, active learning
Related topics
Engineering and basic sciences, computer engineering, computer science software
Abstract


• Usage of text classification for filters in domain-specific search engines.
• Annotation study resulting in a new text corpus for evaluating document classification.
• Insights into the deployment of new filters in search engines in a real application scenario.
• On- and off-line evaluation of the approach.
• Extensive study on the impact of the system's parameters.

Domain-specific search engines exist in various fields and provide additional value by exploiting knowledge of their respective domains. One common mechanism is filters, which let users narrow down the search results by predefined filter categories. In this article we use a text classification system to create these filters. The approach is tailored to large-scale settings with a reduced amount of manually annotated training data and hence enables a cost-efficient roll-out of new filters. An initial annotation study produced a corpus that was used for an off-line evaluation of the approach, giving insight into the effect of the system's parameters. Finally, a large on-line evaluation was carried out together with a provider of a domain-specific search engine. This article presents important aspects to consider when implementing text classification-based filters in the industrial setting of a domain-specific search engine.

Publisher
Database: Elsevier - ScienceDirect
Journal: Computers in Industry - Volume 78, May 2016, Pages 70–79