Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
455707 | Computers & Electrical Engineering | 2013 | 18 Pages |
Inspired by bagging and boosting algorithms in classification, non-weighing-based and weighing-based sampling approaches for clustering are proposed and studied in this paper. The effectiveness of the non-weighing-based sampling technique is investigated in conjunction with several consensus algorithms, including a comparison of sampling with and without replacement. Experimental results show improved stability and accuracy for clustering structures obtained via bootstrapping, subsampling, and boosting techniques. Small subsamples can reduce the computational cost and measurement complexity of many unsupervised data mining tasks with distributed data sources. This empirical study also compares the performance of boosting and bagging clustering ensembles using different consensus functions on a number of datasets.
Highlights
► A new clustering ensemble method is proposed, based on boosting-style sampling of the original data.
► We have extended a new framework with a boosting data sampling mechanism for generating partitionings.
► Several consensus functions are compared with each other in detail.
► We find that the boosting sampling method combined with the MCLA consensus function constructs a good ensemble.
► We find that, in a boosting clustering ensemble, 10 base partitionings is the best choice of ensemble size.
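The abstract's contrast between sampling with replacement (bootstrapping) and without replacement (subsampling) can be illustrated with a small sketch. This is not the paper's implementation: the 1-D toy data, the tiny k-means routine, and every function name below are hypothetical, and the consensus step is a simple co-association (evidence accumulation) threshold used as a stand-in, not MCLA or any consensus function evaluated in the paper.

```python
import random
from itertools import combinations

def draw_indices(n, size, replace, rng):
    """Bootstrap (replace=True) or subsample (replace=False) row indices."""
    if replace:
        return [rng.randrange(n) for _ in range(size)]
    return rng.sample(range(n), size)

def kmeans_1d(values, k, iters=10):
    """Tiny 1-D k-means with evenly spaced initial centers (illustrative only)."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * c / max(k - 1, 1) for c in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels

def co_association(data, k, runs, size, replace, seed=0):
    """Fraction of co-sampled runs in which each pair (i < j) shared a cluster."""
    rng = random.Random(seed)
    n = len(data)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(runs):
        idx = draw_indices(n, size, replace, rng)
        labels = kmeans_1d([data[i] for i in idx], k)
        label_of = dict(zip(idx, labels))  # duplicates of a point share a label
        for i, j in combinations(sorted(label_of), 2):
            sampled[i][j] += 1
            together[i][j] += label_of[i] == label_of[j]
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]

def consensus_labels(coassoc, threshold=0.5):
    """Link pairs above the threshold and label the connected components."""
    n = len(coassoc)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                a, b = min(i, j), max(i, j)  # only the upper triangle is filled
                if labels[j] == -1 and coassoc[a][b] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Toy data: two well-separated groups of six points each (an assumption).
data = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 10.0, 10.2, 10.4, 10.6, 10.8, 11.0]
bagged = consensus_labels(co_association(data, k=2, runs=30, size=12, replace=True))
subsampled = consensus_labels(co_association(data, k=2, runs=30, size=8, replace=False))
```

Normalizing each pair's count by the number of runs in which both points were drawn keeps the two sampling schemes comparable, since a subsample of 8 out of 12 points never contains every pair in a single run.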