Article ID: 4965239
Journal: Computers, Environment and Urban Systems
Published Year: 2017
Pages: 10
File Type: PDF
Abstract
Common clustering techniques for remote sensing images, such as K-Means, are usually sensitive to their initial starting conditions. Kaufman's initialization can provide a set of initial cluster centers that produce stable and accurate clustering results for remote sensing images. However, its most notable drawback is that it is computationally expensive, and its performance is further challenged when it is applied to large remote sensing images. In this paper, we present a MapReduce-based Parallel Kaufman (MPK) implementation for accelerating the initialization step of clustering. As part of MPK, Grid-based Sequential Systematic Sampling (GS3), a new data partitioning method for remote sensing images, is also presented. Unlike the conventional area-based data partitioning method, GS3 is designed specifically for parallel Kaufman implementation. MPK encompasses four key components and was implemented on a Hadoop cluster on a private cloud. Experiments conducted on a number of remote sensing images of different sizes show promising results, with significant speedup.
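To illustrate why the initialization step is expensive, the following is a minimal sequential sketch of Kaufman's heuristic in plain Python. It is not the MPK implementation described in the paper, and it does not include GS3 or any MapReduce logic. The function name `kaufman_init` is hypothetical, and choosing the first seed as the point with the smallest total distance to all others is one common reading of "most centrally located", assumed here. Every added seed requires a pass over all pairs of unselected points, which is the quadratic cost the abstract refers to.

```python
import math


def kaufman_init(points, k):
    """Return indices of k seeds chosen by Kaufman's heuristic.

    Illustrative sketch only: O(n^2) memory for the distance matrix and
    roughly O(k * n^2) time, so it is suitable for small inputs.
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(points)")

    # Full pairwise Euclidean distance matrix.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Assumption: the first seed is the point with the smallest total
    # distance to all other points (the most centrally located one).
    first = min(range(n), key=lambda i: sum(dist[i]))
    seeds = [first]
    selected = {first}

    # nearest[j] is the distance from point j to its closest seed so far.
    nearest = dist[first][:]

    while len(seeds) < k:
        best_i, best_gain = None, -1.0
        for i in range(n):
            if i in selected:
                continue
            # Gain of candidate i: how much closer every other unselected
            # point j would be to a seed if i were added.
            gain = sum(
                max(nearest[j] - dist[i][j], 0.0)
                for j in range(n)
                if j != i and j not in selected
            )
            if gain > best_gain:
                best_i, best_gain = i, gain
        seeds.append(best_i)
        selected.add(best_i)
        # Update each point's distance to its closest seed.
        nearest = [min(nearest[j], dist[best_i][j]) for j in range(n)]

    return seeds
```

The returned indices can be used as starting centers for K-Means. For an image with millions of pixels the pairwise loop above is impractical, which is the motivation for partitioning the data and running the computation in parallel.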
Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science Applications