A new distance with derivative information for functional k-means clustering algorithm

Article ID	Journal	Published Year	Pages	File Type
6856212	Information Sciences	2018	20 Pages	PDF

Abstract

The functional k-means clustering algorithm is a widely used method for clustering functional data. However, this algorithm does not take derivative information into account when calculating the similarity between two functional samples, even though derivative information is important for capturing differences in trend among functional data. In this paper, we define a novel distance that measures the similarity between functional samples by incorporating their derivative information. Furthermore, we construct, in theory, cluster centroids that minimize the objective function of the functional k-means clustering algorithm under the proposed distance. After preprocessing the functional data with three common basis representation techniques, we compare the clustering performance of functional k-means clustering algorithms based on four different similarity metrics. Experiments on six data sets with class labels statistically demonstrate the effectiveness and robustness of the functional k-means clustering algorithm with the proposed distance. In addition, experimental results on three real-life data sets verify the convergence and practicability of the algorithm with the proposed distance.
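The abstract does not state the formula of the proposed distance, so the following is only a minimal sketch of the general idea, not the authors' definition. It assumes a squared distance of the form "integral of (x - y)^2 plus weight times integral of (x' - y')^2", evaluated on curves sampled on a common grid, with finite-difference derivatives and trapezoidal integration. The names `derivative_distance`, `functional_kmeans`, and the `weight` parameter are hypothetical. Under this assumed distance, the pointwise mean of the member curves minimizes the within-cluster sum of squared distances, so the sketch uses it as the centroid.

```python
import math


def derivative(values, grid):
    """Finite-difference derivative of a sampled curve (one-sided at the ends)."""
    n = len(values)
    result = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        result.append((values[hi] - values[lo]) / (grid[hi] - grid[lo]))
    return result


def trapezoid(values, grid):
    """Trapezoidal approximation of the integral of a sampled curve."""
    return sum(
        0.5 * (values[i] + values[i + 1]) * (grid[i + 1] - grid[i])
        for i in range(len(grid) - 1)
    )


def derivative_distance(x, y, grid, weight=1.0):
    """ASSUMED form: sqrt( int (x-y)^2 dt + weight * int (x'-y')^2 dt )."""
    diff = [a - b for a, b in zip(x, y)]
    ddiff = derivative(diff, grid)  # differentiation is linear
    level = trapezoid([v * v for v in diff], grid)
    trend = trapezoid([v * v for v in ddiff], grid)
    return math.sqrt(level + weight * trend)


def functional_kmeans(curves, grid, init_centroids, weight=1.0, max_iter=100):
    """Lloyd-style k-means on sampled curves using derivative_distance."""
    centroids = [list(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid under the assumed distance.
        new_labels = [
            min(
                range(len(centroids)),
                key=lambda j: derivative_distance(c, centroids[j], grid, weight),
            )
            for c in curves
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: pointwise mean of the curves in each cluster.
        for j in range(len(centroids)):
            members = [c for c, lab in zip(curves, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy data: two rising and two falling lines on [0, 1].
grid = [i / 50 for i in range(51)]
rising = [[t + shift for t in grid] for shift in (0.0, 0.1)]
falling = [[1.0 + shift - t for t in grid] for shift in (0.0, 0.1)]
curves = rising + falling
labels, centroids = functional_kmeans(curves, grid, [curves[0], curves[-1]])
```

In this toy example the rising and falling lines take similar values near the middle of the interval, but their slopes differ everywhere, so the derivative term is what separates the two groups most strongly. The choice of `weight` is left open here; the abstract does not say how the two terms are balanced.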

Keywords

Variational theory; Functional data