Article ID Journal Published Year Pages File Type
407025 Neurocomputing 2014 17 Pages PDF
Abstract

•Retrievability is used to analyze the retrieval bias of retrieval models.
•Measuring retrievability requires processing an exhaustive number of queries.
•We propose a query-independent approach that estimates the retrievability of documents from document features.
•We propose three classes of features: surface-level features, term-weighting features, and the density around the nearest neighbors of documents.
•Experiments indicate that our approach predicts the retrievability of documents using less processing time and fewer resources.

Retrievability is a measure of access that quantifies how easily documents can be found using a retrieval system. Such a measure is of particular interest in recall-oriented retrieval domains such as patent or legal retrieval: if a retrieval system in these domains makes some documents hard to find, professional searchers will have difficulty retrieving them. One main limitation of retrievability analysis is that it depends on processing an exhaustive number of queries, which requires considerable processing time and resources. To address this problem, this paper uses a document-feature-based approach to estimate the retrievability ranks of documents. In experiments, the strong correlation between features and retrievability scores on different collections confirms that it is possible to estimate the retrievability ranks of documents without processing queries. One major advantage of this approach is that it requires fewer resources and can be computed more quickly than the query-based approach. One major disadvantage is that it can only estimate the retrievability ranks of documents; it cannot quantify the retrievability inequality (retrieval bias) among the documents of a collection.
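To make the query-based baseline concrete, the following is a minimal sketch of how retrievability scores and retrieval bias are commonly computed. The abstract does not give a formula, so the details here are assumptions: each document's score is the number of queries that rank it within a cutoff, and bias is summarized with the Gini coefficient over those scores. The function names, the `cutoff` parameter, and the toy rankings are hypothetical and do not come from the paper.

```python
def retrievability_scores(ranked_results, doc_ids, cutoff):
    """Count, per document, the queries that rank it within the top `cutoff`.

    ranked_results: dict mapping a query to its ranked list of document ids.
    doc_ids: every document in the collection, so that documents that are
             never retrieved still receive a score of 0.
    """
    scores = {doc_id: 0 for doc_id in doc_ids}
    for ranking in ranked_results.values():
        for doc_id in ranking[:cutoff]:
            scores[doc_id] = scores.get(doc_id, 0) + 1
    return scores


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


# Toy data (assumed): three queries over a four-document collection.
results = {
    "q1": ["d1", "d2", "d3"],
    "q2": ["d1", "d3", "d2"],
    "q3": ["d1", "d4", "d2"],
}
scores = retrievability_scores(results, ["d1", "d2", "d3", "d4"], cutoff=2)
bias = gini(scores.values())
```

The loop over `ranked_results` is the step that becomes expensive when the query set is exhaustive. A feature-based estimate of the kind described in the abstract would avoid that loop, but it yields only a ranking of documents and therefore no input for a bias measure such as `gini`.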
