Article Code | Journal Code | Publication Year | English Article | Full-Text Version |
---|---|---|---|---|
4969162 | 1449896 | 2018 | 16-page PDF | Free download |
- We study how to prioritize relevance assessments in the process of creating an Information Retrieval test collection.
- We propose effective fusion models, based on score distributions, and adapted to the characteristics of this evaluation task.
- Our formal approach leads to natural and competitive solutions when compared to state of the art methods.
- We also demonstrate the benefits of including pseudo-relevance evidence into the estimation of the score distribution models.
In this paper, we study how to prioritize relevance assessments when creating an Information Retrieval test collection. A test collection consists of a set of queries, a document collection, and a set of relevance assessments. For each query, only a sample of documents from the collection can be manually assessed for relevance, and multiple retrieval strategies are typically used to obtain that sample. Rank fusion therefore plays a fundamental role in building the sample by combining multiple search results. We propose effective rank fusion models adapted to the characteristics of this evaluation task. Our models are based on the distribution of retrieval scores supplied by the search systems, and our experiments show that this formal approach leads to natural solutions that are competitive with state-of-the-art methods. We also demonstrate the benefits of including pseudo-relevance evidence in the estimation of the score distribution models.
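The abstract does not spell out the fusion models themselves, but a common score-distribution assumption in the IR literature models non-relevant scores with an exponential and relevant scores with a Gaussian. The sketch below is a minimal illustration under that assumption, not the paper's actual method: all function names are hypothetical. It fits such a mixture to each system's scores with EM, converts each score into a posterior probability of relevance, and prioritizes documents for assessment by their summed posteriors across systems.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Gaussian density, used for the relevant-score component."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def fit_exp_gauss_mixture(scores, n_iter=200):
    """EM for p(s) = pi_nr * Exp(lam) + pi_r * N(mu, sigma^2).

    Exponential models non-relevant scores, Gaussian models relevant ones
    (a standard score-distribution assumption; scores must be >= 0).
    Returns the posterior probability of relevance for each score.
    """
    s = np.asarray(scores, dtype=float)
    # Crude initialisation: treat the top quartile as "relevant".
    r = (s >= np.quantile(s, 0.75)).astype(float)
    for _ in range(n_iter):
        w_r, w_nr = r.sum(), len(s) - r.sum()
        if w_r < 1e-9 or w_nr < 1e-9:
            break  # degenerate: one component vanished
        # M-step: re-estimate component parameters from responsibilities.
        mu = (r * s).sum() / w_r
        sigma = np.sqrt((r * (s - mu) ** 2).sum() / w_r + 1e-9)
        lam = w_nr / (((1 - r) * s).sum() + 1e-9)
        pi_r = w_r / len(s)
        # E-step: posterior responsibility of the relevant component.
        p_rel = pi_r * normal_pdf(s, mu, sigma)
        p_non = (1 - pi_r) * lam * np.exp(-lam * s)
        r = p_rel / (p_rel + p_non + 1e-12)
    return r

def fuse_by_expected_relevance(runs):
    """runs: list of {doc_id: score} dicts, one per retrieval system.

    Ranks documents by the sum of per-system posterior relevance
    probabilities -- a simple fusion rule for prioritizing assessments.
    """
    totals = {}
    for run in runs:
        docs = list(run)
        post = fit_exp_gauss_mixture([run[d] for d in docs])
        for d, p in zip(docs, post):
            totals[d] = totals.get(d, 0.0) + p
    return sorted(totals, key=totals.get, reverse=True)
```

Documents that score highly under several systems accumulate posterior mass and are assessed first, which is the prioritization goal the abstract describes; the paper's own models (and its use of pseudo-relevance evidence in estimating the distributions) are more refined than this sketch.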
Journal: Information Fusion - Volume 39, January 2018, Pages 56-71