Efficient clustering of databases induced by local patterns

Article ID	Journal	Published Year	Pages	File Type
553937	Decision Support Systems	2008	19 Pages	PDF

Abstract

Many large organizations transact from multiple branches and therefore maintain multiple large databases. Most previous work is based on a single database, so it is necessary to study data mining on multiple databases. In this paper, we propose two measures of similarity between a pair of databases, together with an algorithm for clustering a set of databases. The efficiency of the clustering process is improved through the following strategies: reducing the execution time of the clustering algorithm, using a more appropriate similarity measure, and storing frequent itemsets space-efficiently.
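
The abstract does not define the two similarity measures or the clustering algorithm, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes a Jaccard-style similarity over the sets of frequent itemsets reported by each database, and groups databases whose pairwise similarity reaches a chosen threshold. The names `jaccard_similarity`, `cluster_databases`, `branches`, and the threshold value are all hypothetical.

```python
from itertools import combinations


def jaccard_similarity(fis_a, fis_b):
    """Assumed measure: fraction of frequent itemsets shared by two databases."""
    union = fis_a | fis_b
    if not union:
        return 0.0
    return len(fis_a & fis_b) / len(union)


def cluster_databases(frequent_itemsets, threshold):
    """Group databases whose pairwise similarity is at least `threshold`.

    Stand-in technique: union-find over a similarity graph, so two databases
    share a cluster if a chain of sufficiently similar pairs connects them.
    """
    names = list(frequent_itemsets)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the root, shortening the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        sim = jaccard_similarity(frequent_itemsets[a], frequent_itemsets[b])
        if sim >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical local patterns: frequent itemsets mined at each branch database.
branches = {
    "D1": {frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b"})},
    "D2": {frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b"}),
           frozenset({"c"})},
    "D3": {frozenset({"x"}), frozenset({"y"})},
}
result = cluster_databases(branches, threshold=0.5)
print(result)
```

In this toy input, D1 and D2 share three of four distinct frequent itemsets and fall into one cluster, while D3 shares none and stays alone. Only the locally mined patterns are compared, not the raw transactions, which is the sense in which such a clustering is induced by local patterns.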

Keywords

Multi-database mining; Local pattern analysis; Clustering