Article code | Journal code | Year | English article | Full-text version |
---|---|---|---|---|
535782 | 870379 | 2012 | 8-page PDF | Free download |
In machine learning and statistics, kernel density estimators are rarely used on multivariate data because of the difficulty of finding an appropriate kernel bandwidth that avoids overfitting. However, recent advances in information-theoretic learning have revived interest in these models. With this motivation, in this paper we revisit the classical statistical problem of data-driven bandwidth selection by cross-validated maximum likelihood for Gaussian kernels. We find a solution to the optimization problem in both the spherical case and the general case, where a full covariance matrix is considered for the kernel. The fixed-point algorithms proposed in this paper obtain the maximum-likelihood bandwidth in a few iterations, without performing an exhaustive bandwidth search, which is infeasible in the multivariate case. The convergence of the proposed methods is proved. A set of classification experiments is performed to demonstrate the usefulness of the obtained models in pattern recognition.
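The spherical case of the idea described above can be sketched as follows. For a Gaussian KDE with a single bandwidth h, setting the derivative of the leave-one-out log-likelihood to zero yields a fixed-point update in which the new h² is a responsibility-weighted average of pairwise squared distances. This is only an illustrative sketch of a fixed-point scheme of this kind, not necessarily the paper's exact algorithm; the function name and stopping rule are ours.

```python
import numpy as np

def loo_ml_bandwidth(X, tol=1e-6, max_iter=200):
    """Fixed-point iteration for the leave-one-out maximum-likelihood
    bandwidth of a spherical Gaussian KDE (illustrative sketch)."""
    n, d = X.shape
    # All pairwise squared distances; the diagonal is excluded
    # (leave-one-out) by setting it to +inf.
    D = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    np.fill_diagonal(D, np.inf)
    s = np.mean(D[np.isfinite(D)]) / d  # initial guess for h^2
    for _ in range(max_iter):
        # Responsibilities w_ij ∝ exp(-||x_i - x_j||^2 / (2 h^2)),
        # normalised over j != i; stabilised via the row maximum.
        logw = -D / (2.0 * s)
        logw -= logw.max(axis=1, keepdims=True)
        W = np.exp(logw)
        W /= W.sum(axis=1, keepdims=True)
        # Fixed-point update: h^2 <- weighted mean squared distance / d.
        s_new = np.sum(W * np.where(np.isfinite(D), D, 0.0)) / (n * d)
        if abs(s_new - s) < tol * s:
            s = s_new
            break
        s = s_new
    return np.sqrt(s)  # the bandwidth h
```

Each iteration costs O(n²d) for the distance-weighted sums, which is far cheaper than an exhaustive grid search over a full covariance matrix.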
► We provide novel algorithms for bandwidth selection in kernel density estimators.
► Two cases are explored: the spherical and the general (unconstrained) Gaussian kernel.
► It is the first method to reach a maximum-likelihood solution for the general case.
► The convergence of both algorithms is proved.
► The optimized KDEs show good classification performance on real data.
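One common way to use optimized KDEs for classification, as the last highlight suggests, is a plug-in Bayes rule: fit one KDE per class and assign a point to the class maximizing prior times estimated density. The sketch below assumes spherical Gaussian kernels with per-class bandwidths; it is an illustration of this standard construction, not the paper's exact experimental protocol.

```python
import numpy as np

def kde_log_density(Xtrain, x, h):
    """Log-density of x under a spherical Gaussian KDE with bandwidth h."""
    d = Xtrain.shape[1]
    D = np.sum((Xtrain - x) ** 2, axis=1)
    logk = -D / (2.0 * h**2) - 0.5 * d * np.log(2.0 * np.pi * h**2)
    m = logk.max()  # log-sum-exp trick for numerical stability
    return m + np.log(np.mean(np.exp(logk - m)))

def kde_classify(class_samples, bandwidths, priors, x):
    """Plug-in Bayes rule: argmax over classes of
    log prior + log KDE density (illustrative)."""
    scores = [np.log(p) + kde_log_density(Xc, x, h)
              for Xc, h, p in zip(class_samples, bandwidths, priors)]
    return int(np.argmax(scores))
```

With bandwidths chosen by maximum likelihood per class, the resulting densities plug directly into this rule.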
Journal: Pattern Recognition Letters - Volume 33, Issue 13, 1 October 2012, Pages 1717–1724