Article ID Journal Published Year Pages File Type
4959538 European Journal of Operational Research 2017 37 Pages PDF
Abstract
A new combinatorial model for clustering is proposed for all applications in which individual and relational data are available. Individual data refer to the intrinsic features of units, they are stored in a matrix D, and are the typical input of all clustering algorithms proposed so far. Relational data refer to the observed links between units, representing social ties such as friendship, joint participation to social events, and so on. Relational data are stored in the graph G=(V,E), and the data available for clustering are the triplet G=(V,E,D), called attributed graph. Known clustering algorithms can take advantage of the relational structure of G to redefine and refine the units membership. For example, uncertain membership of units to groups can be resolved using the sociological principle that ties are more likely to form between similar units. The model proposed here shows how to take into account the graph information, combining the clique partitioning objective function (a known clustering methodology) with connectivity as the structural constraint of the resulting clusters. The model can be formulated and solved using Integer Linear Programming and a new family of cutting planes. Moderate size problems are solved, and heuristic procedures are developed for instances in which the optimal solution can only be approximated. Finally, tests conducted on simulated data show that the clusters quality is greatly improved through this methodology.
Related Topics
Physical Sciences and Engineering Computer Science Computer Science (General)
Authors
, , ,