Article ID: 6865594
Journal: Neurocomputing
Published Year: 2015
Pages: 10
File Type: PDF
Abstract
This paper addresses Principal Components Analysis (PCA) of data spread over a network in which central coordination and synchronous communication between nodes are not allowed. We propose an asynchronous, decentralized PCA algorithm designed for large-scale problems, where “large” applies simultaneously to the dimensionality, the number of observations, and the network size. The algorithm integrates a dimension reduction step into a gossip consensus protocol. Unlike other approaches, it admits a straightforward dual formulation, which makes it suitable when the observed dimensions are distributed across nodes. We show theoretically that it is equivalent to centralized PCA under a low-rank assumption on the training data. An experimental analysis shows that it achieves good accuracy at a reasonable communication cost even when the low-rank assumption is relaxed.
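To make the general idea of gossip-based PCA concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a fully connected network, uses plain sum-weight (push-sum) gossip to average local second-order statistics, and omits the dimension reduction step that the paper integrates into the protocol. The function name `gossip_pca` and its parameters are hypothetical.

```python
import numpy as np

def gossip_pca(local_data, n_components, n_messages=2000, seed=0):
    """Illustrative push-sum gossip PCA (assumes >= 2 nodes, complete graph).

    local_data: list of (n_i, d) arrays, one per node.
    Returns one (d, n_components) basis estimate per node.
    """
    rng = np.random.default_rng(seed)
    n_nodes = len(local_data)

    # Each node starts from its own statistics: sample count, sum, and
    # uncentered scatter matrix.
    w = np.array([float(len(X)) for X in local_data])
    s = [X.sum(axis=0) for X in local_data]
    C = [X.T @ X for X in local_data]

    for _ in range(n_messages):
        # One asynchronous event: a random sender picks a random receiver.
        i = rng.integers(n_nodes)
        j = (i + 1 + rng.integers(n_nodes - 1)) % n_nodes  # guarantees j != i

        # Sender keeps half of its state and pushes the other half.
        w[i] *= 0.5
        s[i] = s[i] * 0.5
        C[i] = C[i] * 0.5
        w[j] += w[i]
        s[j] = s[j] + s[i]
        C[j] = C[j] + C[i]

    bases = []
    for k in range(n_nodes):
        # Ratios of gossiped sums approximate the global mean and covariance.
        mean = s[k] / w[k]
        cov = C[k] / w[k] - np.outer(mean, mean)
        _, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
        bases.append(eigvecs[:, ::-1][:, :n_components])
    return bases
```

In this sketch every message carries a full d x d matrix, which is exactly the communication cost that a dimension reduction step inside the gossip protocol is meant to reduce. Because eigenvectors are only defined up to sign, node estimates are best compared through the projectors `B @ B.T` rather than the bases themselves.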
Related Topics
Physical Sciences and Engineering; Computer Science; Artificial Intelligence