Article ID: 427174
Journal: Information Processing Letters
Published Year: 2013
Pages: 6
File Type: PDF
Abstract

• We present a new on-line algorithm for finding frequent itemsets in a P2P network that monitors data streams.
• The algorithm is asynchronous.
• It exhibits good locality.
• It reduces communication load.
• It improves the convergence rate, as measured by the recall and precision ratios.

Data-intensive large-scale distributed systems such as peer-to-peer (P2P) networks are finding a large number of applications in social networking, file sharing, and similar domains. Global data mining in such P2P environments can be very costly because of the large scale and asynchronous nature of the networks. The cost increases further in the distributed data stream scenario, where peers rapidly receive a continuous sequence of transactions. In this paper, we develop an efficient local algorithm, P2P-FISM, for discovering the network-wide recent frequent itemsets. The algorithm works in a completely asynchronous manner, imposes low communication overhead (a necessity for scalability), transparently tolerates network topology changes, and quickly adapts to changes in the data stream. The paper presents experimental results that corroborate the theoretical claims.
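To make the notion of "recent frequent itemsets" concrete, the following is a minimal single-peer sketch in Python. It is not the P2P-FISM algorithm, whose details are not given in this abstract; it only illustrates, under assumptions of our own, how one peer might track itemset support over its most recent transactions. The class name `SlidingWindowItemsets`, the fixed-size window, and the `max_len` cap on itemset size are all hypothetical choices made for illustration.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowItemsets:
    """Hypothetical helper: itemset counts over one peer's most recent transactions."""

    def __init__(self, window_size, max_len=2):
        self.window_size = window_size  # number of recent transactions kept (assumption)
        self.max_len = max_len          # largest itemset size counted (assumption)
        self.window = deque()
        self.counts = Counter()

    def _subsets(self, transaction):
        # Enumerate every itemset of size 1..max_len as a sorted tuple.
        items = sorted(set(transaction))
        for k in range(1, self.max_len + 1):
            yield from combinations(items, k)

    def add(self, transaction):
        # Count the new transaction, then expire the oldest one if the window is full.
        self.window.append(transaction)
        self.counts.update(self._subsets(transaction))
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            self.counts.subtract(self._subsets(expired))

    def frequent(self, min_support):
        # Itemsets whose relative support in the current window meets the threshold.
        n = len(self.window)
        if n == 0:
            return set()
        return {s for s, c in self.counts.items() if c / n >= min_support}


peer = SlidingWindowItemsets(window_size=3)
for t in [{"a", "b"}, {"a", "c"}, {"a", "b"}, {"b", "c"}]:
    peer.add(t)
print(peer.frequent(0.6))
```

In a network-wide setting, each peer would hold only local counts like these; how the peers combine them asynchronously and with low communication is the contribution of the paper and is not reproduced here.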

Related Topics
Physical Sciences and Engineering > Computer Science > Computational Theory and Mathematics