Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
405321 | 677530 | 2011 | 8 pages, PDF | Free download |

Due to the rapid growth of free text documents available in digital form, efficient techniques for automatic categorization are of great importance. In this paper, we present ROLEX-SP, an efficient rule-based method for categorizing free text documents. The contributions of this research are the formation of lexical syntactic patterns as basic classification features, a categorization framework that addresses the problem of classifying free text with minimal label description, and a learning algorithm that is efficient in terms of time complexity and F-measure. The ROLEX-SP framework concentrates on capturing the correct classes of text as well as reducing classification errors. We performed experiments to evaluate the proposed method and to compare it with state-of-the-art methods that use domain-specific sources of knowledge. The results indicate that ROLEX-SP outperforms the other methods in terms of standard F-measure in the medical domain, because the MeSH descriptions give strong definitions of the medical categories.
Journal: Knowledge-Based Systems - Volume 24, Issue 1, February 2011, Pages 58–65