Article code: 379207
Journal code: 659274
Publication year: 2009
English article: 19 pages, PDF
Full-text version: Free download
English title of the ISI article
An active learning framework for semi-supervised document clustering with language modeling
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

This paper investigates a framework that actively selects informative document pairs for obtaining user feedback for semi-supervised document clustering. A gain-directed document pair selection method that measures how much we can learn by revealing judgments of selected document pairs is designed. We use the estimation of term co-occurrence probabilities as a clue for finding informative document pairs. Term co-occurrence probabilities are considered in the semi-supervised document clustering process to capture term-to-term dependence relationships. In the semi-supervised document clustering, each cluster is represented by a language model. We have conducted extensive experiments on several real-world corpora. The results demonstrate that our proposed framework is effective.
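
To make the idea of representing each cluster by a language model concrete, here is a minimal sketch, not the paper's actual method. It assumes a unigram model with add-one smoothing and assigns a document to the cluster whose model gives it the highest log-likelihood. The function names (`train_cluster_lm`, `log_likelihood`, `assign`), the smoothing choice, and the toy data are all illustrative assumptions; the term co-occurrence estimates and the active selection of document pairs described in the abstract are not modeled here.

```python
import math
from collections import Counter


def train_cluster_lm(docs, vocab, alpha=1.0):
    """Estimate a smoothed unigram model P(term | cluster).

    Assumption: additive (Laplace) smoothing with parameter alpha;
    the paper's own estimator may differ.
    """
    counts = Counter(tok for doc in docs for tok in doc if tok in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {term: (counts[term] + alpha) / total for term in vocab}


def log_likelihood(doc, lm):
    """Log-probability of a tokenized document under a cluster model."""
    # Tokens outside the vocabulary are skipped in this sketch.
    return sum(math.log(lm[tok]) for tok in doc if tok in lm)


def assign(doc, cluster_lms):
    """Return the index of the cluster model that best explains the document."""
    return max(range(len(cluster_lms)),
               key=lambda k: log_likelihood(doc, cluster_lms[k]))


# Hypothetical toy corpus: two clusters of tokenized documents.
clusters = [
    [["stock", "market", "trade"], ["market", "price"]],
    [["match", "goal", "team"], ["team", "score", "goal"]],
]
vocab = {tok for cluster in clusters for doc in cluster for tok in doc}
lms = [train_cluster_lm(cluster, vocab) for cluster in clusters]

label = assign(["goal", "team", "price"], lms)
print(label)
```

In a semi-supervised setting, user judgments on document pairs would additionally constrain which assignments are allowed; that step is omitted from this sketch.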

Publisher
Database: Elsevier - ScienceDirect
Journal: Data & Knowledge Engineering - Volume 68, Issue 1, January 2009, Pages 49–67
Authors
, ,