Article ID: 4604954
Journal: Applied and Computational Harmonic Analysis
Published Year: 2016
Pages: 24 Pages
File Type: PDF
Abstract

Diffusion-based kernel methods are commonly used for analyzing massive high-dimensional datasets. These methods take a non-parametric approach, representing the data through an affinity kernel that encodes similarities, distances, or correlations between data points. The kernel is based on a Markovian diffusion process whose transition probabilities are determined by local distances between data points. Spectral analysis of this kernel provides a representation of the data in which Euclidean distances correspond to diffusion distances between data points. When the data lies on a low-dimensional manifold, these diffusion distances capture the geometry of the manifold. In this paper, we present a generalized approach for defining diffusion-based kernels that incorporates measure-based information, representing the density or distribution of the data, together with its local distances. The generalized construction does not require an underlying manifold to provide a meaningful kernel interpretation; instead, it relies on the weaker assumption that the measure and its support reflect a locally low-dimensional nature of the analyzed phenomena. This kernel is shown to satisfy the spectral properties required to provide a low-dimensional embedding of the data. The associated diffusion process is analyzed via its infinitesimal generator, and the resulting embedding is demonstrated in two geometric scenarios.
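
To make the pipeline described in the abstract concrete, the following is a minimal sketch of the classical diffusion-maps construction: a Gaussian affinity kernel, row normalization into Markov transition probabilities, and a spectral embedding. It is not the measure-based kernel proposed in the paper, whose definition is not given in the abstract. The function name `diffusion_map` and the parameters `epsilon` (kernel scale), `n_components`, and `t` (number of diffusion steps) are illustrative assumptions.

```python
import numpy as np


def diffusion_map(X, epsilon, n_components=2, t=1):
    """Classical diffusion-map embedding (illustrative sketch, not the paper's kernel).

    X            : (n, d) array of data points.
    epsilon      : scale of the Gaussian affinity kernel (assumed parameter).
    n_components : number of non-trivial embedding coordinates to keep.
    t            : number of diffusion time steps.
    """
    # Pairwise squared Euclidean distances between data points.
    sq_norms = np.sum(X ** 2, axis=1)
    D2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    D2 = np.maximum(D2, 0.0)  # guard against tiny negative round-off

    # Gaussian affinity kernel based on local distances.
    K = np.exp(-D2 / epsilon)

    # Degrees; P = D^{-1} K would be the Markov transition matrix.
    d = K.sum(axis=1)

    # Symmetric conjugate A = D^{-1/2} K D^{-1/2} has the same spectrum as P.
    A = K / np.sqrt(np.outer(d, d))
    eigvals, eigvecs = np.linalg.eigh(A)

    # Sort eigenpairs in descending order of eigenvalue.
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]

    # Right eigenvectors of P are D^{-1/2} times the eigenvectors of A.
    psi = eigvecs / np.sqrt(d)[:, None]

    # Drop the trivial constant eigenvector (eigenvalue 1) and scale by lambda^t.
    idx = slice(1, n_components + 1)
    coords = psi[:, idx] * eigvals[idx] ** t
    return eigvals, coords


if __name__ == "__main__":
    # Usage example: noisy samples from a circle embedded in 3D.
    rng = np.random.default_rng(0)
    theta = rng.uniform(0.0, 2.0 * np.pi, size=200)
    X = np.column_stack([np.cos(theta), np.sin(theta), 0.05 * rng.normal(size=200)])
    eigvals, coords = diffusion_map(X, epsilon=0.5, n_components=2, t=1)
    print(coords.shape)
```

In this sketch, when all non-trivial coordinates are kept, Euclidean distances between rows of `coords` equal the diffusion distances between the corresponding rows of the t-step transition matrix, weighted by the inverse degrees. Truncating to a few leading coordinates gives the low-dimensional approximation that the abstract refers to.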
