FAUM: Fast Autonomous Unsupervised Multidimensional classification

Article ID	Journal	Published Year	Pages	File Type
6856237	Information Sciences	2018	41 Pages	PDF

Abstract

This article presents Faum (Fast Autonomous Unsupervised Multidimensional), an automatic clustering algorithm that discovers natural groupings in unlabeled data. Faum is designed to make efficient use of the resources of a modern computer when processing big datasets. The algorithm finds disjoint, spherically symmetric clusters deterministically, without requiring the number of clusters, the number of iterations, or any initialization to be specified. Faum is remarkably fast compared to K-Means when processing big multidimensional datasets. Because Faum has average O(N) space and time complexity, it can process datasets of several hundred megabytes in less than a minute on a standard laptop computer. Furthermore, Faum is not sensitive to outliers and may be used on its own or to provide the complete initialization for a deterministic K-Means run.
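The last point, using precomputed centers to make K-Means deterministic, can be illustrated with a minimal sketch. This is not Faum itself, whose internals are not described in this abstract. It is plain Lloyd's K-Means started from caller-supplied seeds, and the function name `kmeans_from_seeds`, the sample points, and the seed values are all hypothetical stand-ins for whatever an upstream algorithm would provide.

```python
from math import dist

def kmeans_from_seeds(points, seeds, max_iter=100):
    """Lloyd's K-Means started from caller-supplied seeds (no random init)."""
    centers = [tuple(s) for s in seeds]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest center; ties go to the lowest index.
        labels = [min(range(len(centers)), key=lambda k: dist(p, centers[k]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return centers, labels

# Toy data: two well-separated groups (illustrative values only).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
# Hypothetical stand-in for centers produced by an upstream algorithm.
seeds = [(0.5, 0.5), (10.5, 10.5)]
centers, labels = kmeans_from_seeds(points, seeds)
```

Because nothing in this procedure is random, repeated runs on the same points and seeds return identical centers and labels. The seeds also fix the number of clusters, so no separate K has to be chosen.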

Keywords

Fast clustering; Big Data