Article ID: 418571
Journal: Discrete Applied Mathematics
Published Year: 2015
Pages: 22 Pages
File Type: PDF
Abstract

The well-known k-nearest neighbors problem (kNN) involves building a data structure that reports the k closest training points to each of a given set of query points, with all points lying in a given metric space S. The problem discussed here is an important operation in rotation-invariant image processing. It is a nontrivial variant of kNN: given a set of training points X and a set of query points Y, find, for each query point y ∈ Y, the k nearest training points to y, where the notion of distance is given by a pseudometric of S defined over the cyclic permutations of y and the elements of X. The multiplicity of query-point permutations makes serial brute-force search too costly for instances of practical size. We present a transformation that enables any instance of the variant to be solved as a standard kNN problem. Although this allows any kNN algorithm to be applied, the transformation itself is too time-consuming to be practical for instances of realistic dimensions. For this reason, we present a condensation algorithm that efficiently eliminates unfavorable training points (those that cannot be among the k closest neighbors of y) and an effective parallel programming approach based on the discrete Fourier transform. We report a significant speedup over brute-force search on practical datasets. We also provide the mathematical basis of a conjecture that, if true, would allow the speedup to be improved significantly. The application of the approach to classification by support vector machines and to k-means clustering is also discussed.
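To make the pseudometric concrete, the following is a minimal sketch (not the paper's implementation) of the key idea behind a DFT-based approach: the minimum Euclidean distance between x and all cyclic shifts of y reduces, via ||x − shift(y)||² = ||x||² + ||y||² − 2⟨x, shift(y)⟩, to a circular cross-correlation, which the FFT computes in O(n log n) instead of the O(n²) of shift-by-shift search. Function names and the NumPy setting are illustrative assumptions.

```python
import numpy as np

def cyclic_min_distance(x, y):
    """Brute force: minimum Euclidean distance between x and every
    cyclic shift of y. Costs O(n^2) for vectors of length n."""
    n = len(y)
    return min(np.linalg.norm(x - np.roll(y, s)) for s in range(n))

def cyclic_min_distance_fft(x, y):
    """Same pseudometric via the DFT. The inner products <x, roll(y, s)>
    for all shifts s are the circular cross-correlation of x and y,
    obtained in O(n log n) with the FFT."""
    # corr[s] == sum_i x[i] * y[(i - s) mod n] == <x, np.roll(y, s)>
    corr = np.fft.ifft(np.fft.fft(x) * np.conj(np.fft.fft(y))).real
    # Squared distance for every shift at once; clamp tiny negatives
    # caused by floating-point round-off before taking the root.
    d2 = np.dot(x, x) + np.dot(y, y) - 2.0 * corr
    return np.sqrt(max(d2.min(), 0.0))
```

For example, when y is an exact cyclic shift of x both functions return 0, and on arbitrary inputs they agree up to floating-point error; a kNN search under this pseudometric would rank training points by `cyclic_min_distance_fft` against each query.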
