Distributing Efficiently the Block-Max WAND Algorithm

Article ID	Journal	Published Year	Pages	File Type
490485	Procedia Computer Science	2013	10 Pages	PDF

Abstract

Large search engines are complex systems composed by several services. Each service is composed by a set of distributed processing nodes dedicated to execute a single operation required to user queries. One of these services is in charge of computing the top-k document results for queries by means of a document ranking operation. This ranking service is a major bottleneck in efficient query processing as billions of documents has to be processed each day. To answer user queries within a fraction of a second, techniques such as the Block-Max WAND algorithm are used to avoid fully processing all documents related to a query. In this work, we propose to efficiently distributing the Block-Max WAND computation among the ranking service processing nodes. Our proposal is devised to reduce memory usage and computation cost by assuming that each one of the P ranking processing nodes provide top-K/P + α documents results, where α is an estimation parameter which is dynamically set for each query. The experimental results show that the proposed approach significantly reduces execution time compared against current approaches used in search engines.