A randomized approximate nearest neighbors algorithm

Article ID	Journal	Published Year	Pages	File Type
4605140	Applied and Computational Harmonic Analysis	2013	30 Pages	PDF

Abstract

We present a randomized algorithm for the approximate nearest neighbor problem in d-dimensional Euclidean space. Given N points {xj} in Rd, the algorithm attempts to find k nearest neighbors for each of xj, where k is a user-specified integer parameter. The algorithm is iterative, and its CPU time requirements are proportional to , with T the number of iterations performed. The memory requirements of the procedure are of the order N⋅(d+k).A byproduct of the scheme is a data structure, permitting a rapid search for the k nearest neighbors among {xj} for an arbitrary point x∈Rd. The cost of each such query is proportional to , and the memory requirements for the requisite data structure are of the order N⋅(d+k)+T⋅(d+N).The algorithm utilizes random rotations and a basic divide-and-conquer scheme, followed by a local graph search. We analyze the schemeʼs behavior for normally distributed points {xj}, and illustrate its performance via several numerical examples.