Assembling approximately optimal binary search trees efficiently using arithmetics

Article ID	Journal	Published Year	Pages	File Type
428054	Information Processing Letters	2009	5 Pages	PDF

Abstract

We introduce a new algorithm for computing an approximately optimal binary search tree with known access probabilities or weights on items. The algorithm is simple to implement and it has two contributions. First, a randomized variant of the algorithm produces a binary search tree with expected performance that improves the previous theoretical guarantees (the performance is dependent on the value of the input random variable). More precisely, if p is the probability of accessing an item, then under expectation the item is found after searching lg1/p+0.087+lg2(1+pmax) nodes, where pmax is the maximal probability among items. The previous best bound was lg1/p+1, albeit deterministic. For the optimal tree our upper bound implies a non-constructive performance bound of H+0.087+lg2(1+pmax), where H is the entropy on the item distribution and the previous bound was H+1. The second contribution of the algorithm is a low cost in i/o models of cost such as the cache-oblivious model, while attaining simultaneously the above bound for the produced tree.