A novel term weighting scheme based on discrimination power obtained from past retrieval results

Article code	Journal code	Publication year	English article	Full-text version
515642	867057	2012	12-page PDF	Free download

Keywords

Discrimination power; Probabilistic model; Language model; Term weighting

Related topics

Engineering and Basic Sciences; Computer Engineering; Computer Science Software

Abstract

Term weighting for document ranking and retrieval has been an important research topic in information retrieval for decades. We propose a novel term weighting method based on the hypothesis that a term’s role in accumulated past retrieval sessions affects its general importance regardless. The method uses past retrieval results, which consist of the queries that contain a particular term, the retrieved documents, and their relevance judgments. A term’s evidential weight, as we propose in this paper, depends on the degree to which the mean frequency values of the term differ between the relevant and non-relevant document distributions in the past. More precisely, it takes into account the rankings and similarity values of the relevant and non-relevant documents. Our experimental results on standard test collections show that the proposed term weighting scheme improves on conventional TF*IDF and language model based schemes. This indicates that evidential term weights bring in a new aspect of term importance and complement the collection statistics that TF*IDF relies on. We also show how the proposed term weighting scheme, based on the notion of evidential weights, is related to the well-known weighting schemes based on language modeling and probabilistic models.

► We propose a term weighting scheme based on discrimination power.
► Discrimination power uses rankings and similarity values.
► Discrimination power outperforms TF*IDF and language model based schemes.
► We show how discrimination power is related to language modeling and probabilistic models.
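
The idea of an evidential weight driven by the gap between mean term frequencies in relevant and non-relevant documents can be illustrated with a small sketch. This is not the formula from the paper: the normalization, the function names `discrimination_power` and `combined_weight`, and the mixing parameter `alpha` are all assumptions made for illustration, and the rankings and similarity values mentioned in the abstract are left out.

```python
from statistics import mean


def discrimination_power(rel_tfs, nonrel_tfs):
    """Toy score in [-1, 1] (an assumption, not the paper's definition).

    rel_tfs / nonrel_tfs: frequencies of one term in the relevant and
    non-relevant documents retrieved by past queries containing that term.
    """
    if not rel_tfs or not nonrel_tfs:
        return 0.0  # no past evidence for this term
    mu_rel = mean(rel_tfs)
    mu_non = mean(nonrel_tfs)
    total = mu_rel + mu_non
    if total == 0:
        return 0.0  # term never occurred in the retrieved documents
    # Normalized gap between the two mean frequencies.
    return (mu_rel - mu_non) / total


def combined_weight(tf_idf, dp, alpha=0.5):
    """Hypothetical combination: scale TF*IDF up or down by the evidence."""
    return tf_idf * (1.0 + alpha * dp)


if __name__ == "__main__":
    rel = [4, 3, 5]   # term frequencies in past relevant documents
    non = [1, 0, 2]   # term frequencies in past non-relevant documents
    dp = discrimination_power(rel, non)
    print(dp, combined_weight(2.0, dp))
```

In this sketch a term that occurs more often in past relevant documents than in non-relevant ones gets a positive score and its TF*IDF weight is boosted; a term with no past evidence keeps its TF*IDF weight unchanged.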

Publisher

Database: Elsevier - ScienceDirect
Journal: Information Processing & Management - Volume 48, Issue 5, September 2012, Pages 919–930

Authors

Sa-kwang Song, Sung Hyon Myaeng
