Article code  Journal code  Publication year  English article  Full-text version
515642  867057  2012  12 pages  PDF, free download
English title of the ISI article
A novel term weighting scheme based on discrimination power obtained from past retrieval results
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Science Software
English abstract

Term weighting for document ranking and retrieval has been an important research topic in information retrieval for decades. We propose a novel term weighting method based on the hypothesis that a term’s role in accumulated retrieval sessions in the past affects its general importance regardless. The method utilizes the availability of past retrieval results consisting of the queries that contain a particular term, the retrieved documents, and their relevance judgments. A term’s evidential weight, as we propose in this paper, depends on the degree to which the mean frequency values for the relevant and non-relevant document distributions in the past differ. More precisely, it takes into account the rankings and similarity values of the relevant and non-relevant documents. Our experimental results on standard test collections show that the proposed term weighting scheme improves on conventional TF*IDF and language model based schemes. This indicates that evidential term weights bring in a new aspect of term importance and complement the collection statistics on which TF*IDF is based. We also show how the proposed term weighting scheme, based on the notion of evidential weights, is related to the well-known weighting schemes based on language modeling and probabilistic models.
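The core idea in the abstract, that a term matters more when its mean frequency differs between past relevant and non-relevant documents, can be illustrated with a minimal sketch. This is not the paper's formula, which the abstract does not give: the function names, the normalization, and the `alpha` mixing parameter are all hypothetical, and the sketch ignores the rankings and similarity values that the actual scheme is said to use.

```python
from statistics import mean


def discrimination_power(rel_freqs, nonrel_freqs):
    """Hypothetical score in [0, 1]: normalized gap between the mean
    frequency of a term in past relevant and non-relevant documents."""
    if not rel_freqs or not nonrel_freqs:
        return 0.0  # no past evidence for this term
    mu_rel = mean(rel_freqs)
    mu_non = mean(nonrel_freqs)
    total = mu_rel + mu_non
    if total == 0:
        return 0.0  # term never occurred in the judged documents
    return abs(mu_rel - mu_non) / total


def combined_weight(tf, idf, power, alpha=1.0):
    """Assumed combination: scale TF*IDF by the evidential score.
    alpha is an illustrative knob, not a parameter from the paper."""
    return tf * idf * (1.0 + alpha * power)


# Term frequencies observed in judged documents from past sessions.
power = discrimination_power([3, 4, 5], [1, 1, 1])
print(power, combined_weight(tf=2, idf=1.5, power=power))
```

A multiplicative combination is used here only because the abstract says evidential weights complement TF*IDF; other combinations would be equally plausible.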


► We propose a term weighting scheme based on discrimination power.
► Discrimination power uses rankings and similarity values.
► Discrimination power outperforms TF*IDF and language model based schemes.
► We show how discrimination power is related to language modeling and probabilistic models.

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Processing & Management - Volume 48, Issue 5, September 2012, Pages 919–930