Article code: 385414
Journal code: 660865
Publication year: 2011
English article: 8-page PDF
Full-text version: free download
English title of the ISI article
Experiments in term weighting for novelty mining
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

Obtaining new information quickly is becoming crucial in today’s economy. A large amount of information, both offline and online, is easily acquired, which exacerbates the problem of information overload. Novelty mining detects documents or sentences that contain novel information and presents those results directly to users (Tang, Tsai, & Chen, 2010). Many methods and algorithms for novelty mining have been studied, but none have compared and discussed the impact of term weighting on the evaluation measures. This paper reports experiments conducted to recommend the best term weighting function for both document-level and sentence-level novelty mining.


► Experiments were performed to recommend the best term weighting function for document and sentence-level novelty mining.
► Binary was the best overall term weighting function for document-level novelty mining.
► TF.IDF was the best term weighting function for sentence-level novelty mining.
► For datasets with a low percentage of novel documents, TF outperformed the binary term weighting function.
► For data with a high percentage of novel documents, TF.IDF outperformed TF on the high-precision cases.
► These results can be used as guidelines for choosing the best term weighting function for novelty mining across a broad range of data.
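
The three term weighting functions named in the highlights can be illustrated with a minimal Python sketch. This is not the paper's implementation: the abstract does not give the exact formulas, so the textbook definitions are assumed here (binary presence, raw term frequency, and TF multiplied by a smoothed inverse document frequency). The novelty score, one minus the maximum cosine similarity to previously seen documents, is likewise an assumption chosen for illustration, and all function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer, assumed for illustration only.
    return text.lower().split()


def binary_weights(tokens):
    # Binary weighting: 1.0 if the term occurs at all.
    return {t: 1.0 for t in set(tokens)}


def tf_weights(tokens):
    # TF weighting: raw count of each term.
    return {t: float(c) for t, c in Counter(tokens).items()}


def tfidf_weights(tokens, doc_freq, n_docs):
    # TF.IDF weighting with a smoothed IDF (an assumed variant):
    # idf(t) = log((1 + N) / (1 + df(t))) + 1
    counts = Counter(tokens)
    return {
        t: c * (math.log((1 + n_docs) / (1 + doc_freq.get(t, 0))) + 1.0)
        for t, c in counts.items()
    }


def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def novelty_score(candidate, history):
    # Assumed novelty metric: 1 - max similarity to any earlier item.
    if not history:
        return 1.0
    return 1.0 - max(cosine(candidate, h) for h in history)


if __name__ == "__main__":
    seen = [tokenize("storm hits the coast"), tokenize("storm warning issued")]
    incoming = tokenize("storm storm damages the coast")

    # Document frequencies computed over the documents seen so far.
    doc_freq = Counter(t for doc in seen for t in set(doc))

    for name, weigh in [
        ("binary", binary_weights),
        ("tf", tf_weights),
        ("tf.idf", lambda toks: tfidf_weights(toks, doc_freq, len(seen))),
    ]:
        history = [weigh(doc) for doc in seen]
        print(name, round(novelty_score(weigh(incoming), history), 3))
```

In a setup like this, an incoming document or sentence would be declared novel when its score exceeds a chosen threshold. Swapping the weighting function changes the scores and therefore which items pass the threshold, which is the effect the experiments compare.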

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 38, Issue 11, October 2011, Pages 14094–14101