Article ID: 535782
Journal: Pattern Recognition Letters
Published Year: 2012
Pages: 8 Pages
File Type: PDF
Abstract

In machine learning and statistics, kernel density estimators are rarely used on multivariate data because of the difficulty of selecting a kernel bandwidth that avoids overfitting. However, recent advances in information-theoretic learning have revived interest in these models. With this motivation, in this paper we revisit the classical statistical problem of data-driven bandwidth selection by cross-validation maximum likelihood for Gaussian kernels. We solve the optimization problem in both the spherical case and the general case, where a full covariance matrix is considered for the kernel. The fixed-point algorithms proposed in this paper obtain the maximum likelihood bandwidth in a few iterations, without performing an exhaustive bandwidth search, which is infeasible in the multivariate case. The convergence of the proposed methods is proved. A set of classification experiments is performed to demonstrate the usefulness of the obtained models in pattern recognition.
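For the spherical case, the leave-one-out cross-validation log-likelihood admits a simple fixed-point view: the squared bandwidth is repeatedly re-estimated as a responsibility-weighted mean of pairwise squared distances. The NumPy sketch below illustrates an update of this kind; the function name, the initialization, and the exact form of the iteration are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def spherical_mlcv_bandwidth(X, n_iter=100, tol=1e-8):
    """Fixed-point sketch for a spherical ML cross-validation bandwidth.

    Iterates sigma2 <- sum_ij w_ij * ||x_i - x_j||^2 / (n * d), where w_ij are
    leave-one-out Gaussian-kernel responsibilities; stationary points of this
    update are stationary points of the leave-one-out log-likelihood.
    """
    n, d = X.shape
    # Pairwise squared distances; the diagonal is excluded (leave-one-out).
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(sq_dists, np.inf)

    sigma2 = np.mean(np.min(sq_dists, axis=1))  # crude starting point (assumption)
    for _ in range(n_iter):
        # Leave-one-out responsibilities w_ij; each row sums to one.
        logits = -sq_dists / (2.0 * sigma2)
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        w = np.exp(logits)
        w /= w.sum(axis=1, keepdims=True)

        # Responsibility-weighted mean of squared distances (diagonal weights are zero).
        finite = np.where(np.isfinite(sq_dists), sq_dists, 0.0)
        new_sigma2 = np.sum(w * finite) / (n * d)
        if abs(new_sigma2 - sigma2) <= tol * sigma2:
            return new_sigma2
        sigma2 = new_sigma2
    return sigma2


# Example: estimate the bandwidth of a spherical Gaussian KDE on toy data.
X = np.random.default_rng(0).normal(size=(200, 3))
print(spherical_mlcv_bandwidth(X))
```

The general case replaces the scalar sigma2 by a full covariance matrix for the kernel and updates it analogously, which is what makes an exhaustive bandwidth search unnecessary.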

► We provide novel algorithms for bandwidth selection in kernel density estimators.
► Two cases are explored: the spherical and the general (unconstrained) Gaussian kernel.
► It is the first method to reach a maximum-likelihood solution for the general case.
► The convergence of both algorithms is proved.
► The optimized KDEs show good classification performance on real data.

Related Topics
Physical Sciences and Engineering › Computer Science › Computer Vision and Pattern Recognition
Authors