Article code: 534873
Journal code: 870297
Publication year: 2011
English article: 8-page PDF
Full-text version: free download
English title of the ISI article
Pairwise optimized Rocchio algorithm for text categorization
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Vision and Pattern Recognition
English abstract

This paper examines the Rocchio algorithm and its application to text categorization. Existing approaches that optimize the parameters of the Rocchio algorithm globally end up choosing one fixed prototype to represent each category in multi-category text categorization problems. As a result, they have limited discriminating power across the distributions of different categories, and their parameter optimization relies on a negative sample set that pools several categories and therefore represents them weakly. We present a pairwise optimized Rocchio algorithm, which dynamically adjusts the prototype position between pairs of categories. Experiments were conducted on three benchmark corpora: 20-Newsgroup, Reuters-21578 and TDT2. The results confirm that the proposed pairwise method achieves an encouraging performance improvement over the conventional Rocchio method. A comparative study with the Support Vector Machine (SVM), a top-performing text classifier, also shows that the pairwise Rocchio method achieves competitive results.

Research Highlights
► The conventional Rocchio algorithm has weak representational ability because it chooses one fixed prototype for each category.
► The pairwise optimized method dynamically adjusts the prototype position between pairs of categories.
► Text categorization experiments were conducted on three benchmark corpora, the 20-Newsgroup, Reuters-21578, and TDT2.
► The results confirm that the proposed pairwise method achieves an encouraging performance improvement over the conventional Rocchio method and is competitive with SVM.
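
The contrast drawn in the highlights, one fixed prototype per category versus prototypes adjusted per pair of categories, can be illustrated with a small sketch. The abstract gives no formulas, so everything below is an assumption: `prototype` uses the textbook Rocchio form (beta times the positive centroid minus gamma times the negative centroid), and `classify_pairwise` is a plain one-vs-one voting scheme with an optional per-pair `gammas` table. The function names, the default weights, and the toy vectors are hypothetical and are not taken from the paper; the authors' actual optimization procedure may differ.

```python
import math
from itertools import combinations

def mean(vectors):
    """Component-wise mean (centroid) of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def prototype(pos, neg, beta=1.0, gamma=0.25):
    """Textbook Rocchio prototype: beta * centroid(pos) - gamma * centroid(neg)."""
    p, n = mean(pos), mean(neg)
    return [beta * x - gamma * y for x, y in zip(p, n)]

def classify_global(doc, data, beta=1.0, gamma=0.25):
    """Conventional setup: one fixed prototype per category,
    with all other categories pooled as the negative set."""
    scores = {}
    for cat, pos in data.items():
        neg = [v for other, vs in data.items() if other != cat for v in vs]
        scores[cat] = cosine(doc, prototype(pos, neg, beta, gamma))
    return max(scores, key=scores.get)

def classify_pairwise(doc, data, gammas=None, beta=1.0, default_gamma=0.25):
    """Illustrative one-vs-one variant (an assumption, not the paper's method):
    each category pair gets its own two prototypes and, optionally, its own
    gamma looked up in `gammas` by the sorted pair (a, b)."""
    gammas = gammas or {}
    votes = {cat: 0 for cat in data}
    for a, b in combinations(sorted(data), 2):
        g = gammas.get((a, b), default_gamma)
        proto_a = prototype(data[a], data[b], beta, g)
        proto_b = prototype(data[b], data[a], beta, g)
        winner = a if cosine(doc, proto_a) >= cosine(doc, proto_b) else b
        votes[winner] += 1
    return max(votes, key=votes.get)

# Hypothetical toy term-frequency vectors, three categories.
train = {
    "sports": [[3, 0, 1], [2, 0, 0]],
    "politics": [[0, 3, 1], [0, 2, 0]],
    "tech": [[0, 1, 3], [1, 0, 2]],
}
doc = [2, 0, 1]
print(classify_global(doc, train), classify_pairwise(doc, train))
```

In this sketch the pairwise variant only ever contrasts a category with one rival at a time, so the negative centroid is never a blend of several unrelated categories. In practice each pair's gamma would be tuned on held-out data; that tuning step is not shown here.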

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 32, Issue 2, 15 January 2011, Pages 375–382