Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
427174 | Information Processing Letters | 2013 | 6 Pages |
• We present a new on-line algorithm for finding frequent itemsets in a P2P network that monitors data streams.
• The algorithm is asynchronous.
• It exhibits good locality.
• It reduces communication load.
• It increases the convergence rate, as measured by the recall and precision ratios.
Data-intensive, large-scale distributed systems such as peer-to-peer (P2P) networks are finding a large number of applications in social networking, file sharing, and similar domains. Global data mining in such P2P environments can be very costly because of the scale and the asynchronous nature of P2P networks. The cost increases further in the distributed data stream scenario, where peers rapidly receive a continuous sequence of transactions. In this paper, we develop an efficient local algorithm, P2P-FISM, for discovering network-wide recent frequent itemsets. The algorithm works in a completely asynchronous manner, imposes low communication overhead (a necessity for scalability), transparently tolerates changes in network topology, and adapts quickly to changes in the data stream. The paper presents experimental results that corroborate the theoretical claims.
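To make the notion of "recent frequent itemsets" concrete, the following is a minimal sketch of how a single peer might count itemsets over a sliding window of its most recent transactions. It is not P2P-FISM: the abstract does not describe the algorithm's internals, so the class name `RecentItemsetCounter`, the fixed-size window, the `max_size` limit on itemset length, and the support threshold are all assumptions made for illustration, and no communication between peers is modeled.

```python
from collections import Counter, deque
from itertools import combinations


class RecentItemsetCounter:
    """Hypothetical single-peer counter over the most recent `window` transactions."""

    def __init__(self, window, max_size=2):
        self.window = window          # number of recent transactions to keep (assumed)
        self.max_size = max_size      # largest itemset size to count (assumed)
        self.transactions = deque()
        self.counts = Counter()

    def _itemsets(self, transaction):
        # Enumerate every itemset of size 1..max_size as a sorted tuple.
        items = sorted(set(transaction))
        for size in range(1, self.max_size + 1):
            yield from combinations(items, size)

    def add(self, transaction):
        # Count the new transaction, then expire the oldest one if the window is full.
        self.transactions.append(transaction)
        self.counts.update(self._itemsets(transaction))
        if len(self.transactions) > self.window:
            expired = self.transactions.popleft()
            self.counts.subtract(self._itemsets(expired))

    def frequent(self, min_support):
        # Itemsets whose share of the windowed transactions is at least min_support.
        n = len(self.transactions)
        return {s for s, c in self.counts.items() if n and c / n >= min_support}


counter = RecentItemsetCounter(window=3)
for t in [["a", "b"], ["a", "b", "c"], ["a", "c"], ["b", "c"]]:
    counter.add(t)
recent = counter.frequent(0.6)
```

After the fourth transaction the first one has left the window, so `("a", "b")` no longer meets the threshold while `("a", "c")` and `("b", "c")` do. A network-wide algorithm would additionally have to combine such local counts across peers, which this sketch does not attempt.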