Article ID Journal Published Year Pages File Type
477076 European Journal of Operational Research 2011 10 Pages PDF
Abstract

Exact global optimization of the clusterwise regression problem is challenging and there are currently no published feasible methods for performing this clustering optimally, even though it has been over thirty years since its original proposal. This work explores global optimization of the clusterwise regression problem using mathematical programming and related issues. A mixed logical-quadratic programming formulation with implication of constraints is presented and contrasted against a quadratic formulation based on the traditional big-M, which cannot guarantee optimality because the regression line coefficients, and thus errors, may be arbitrarily large. Clusterwise regression optimization times and solution optimality for two clusters are empirically tested on twenty real datasets and three series of synthetic datasets ranging from twenty to one hundred observations and from two to ten independent variables. Additionally, a few small real datasets are clustered into three lines.

Related Topics
Physical Sciences and Engineering Computer Science Computer Science (General)
Authors
, , ,