Article code | Journal code | Publication year | English article | Full-text version
397489 | 671251 | 2011 | 17 pages, PDF | Free download
English title of the ISI article
A hybrid approach for estimating document frequencies in unstructured P2P networks
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
Abstract

Scalable search and retrieval over numerous web document collections distributed across different sites can be achieved by adopting a peer-to-peer (P2P) communication model. Terms and their document frequencies are the main components of text information retrieval and therefore need to be computed, aggregated, and distributed throughout the system. This is challenging in unstructured P2P networks, because local document collections may not accurately reflect the global collection, for instance when documents are unevenly distributed across peers. Moreover, assembling all of the information centrally does not scale, owing to the excessive cost of storage and maintenance and to issues related to digital rights management. In this paper, we present an efficient hybrid approach to aggregating document frequencies: a hierarchical overlay network handles a carefully selected set of the most important terms, while gossip-based aggregation handles the remaining terms in the collections. We also present a cost analysis of the communication cost of hybrid aggregation, and we conduct experiments on three document collections to evaluate the quality of the proposed approach.
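
To make the gossip-based part of the idea concrete, the following is a minimal sketch of push-pull averaging over per-term document frequencies. It is not the paper's protocol: the function name `gossip_average`, the toy peer data, the fixed number of rounds, and the assumption that the number of peers is known are all illustrative choices made here.

```python
import random

def gossip_average(local_df, rounds=30, seed=0):
    """Push-pull gossip averaging of per-term document frequencies.

    local_df: list of dicts, one per peer, mapping term -> local document frequency.
    Returns, for every peer, its estimate of the network-wide average per term.
    """
    rng = random.Random(seed)
    terms = set().union(*local_df)
    # Each peer starts from its own local counts (0 for terms it has not seen).
    est = [{t: float(df.get(t, 0)) for t in terms} for df in local_df]
    n = len(est)
    for _ in range(rounds):
        for i in range(n):
            # Pick a random peer other than i.
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            # Both peers replace their values with the pairwise mean,
            # which keeps the network-wide sum of each term unchanged.
            for t in terms:
                avg = (est[i][t] + est[j][t]) / 2.0
                est[i][t] = est[j][t] = avg
    return est

# Hypothetical toy network of four peers (illustrative data only).
peers = [
    {"p2p": 8, "gossip": 1},
    {"p2p": 2, "overlay": 5},
    {"gossip": 3, "overlay": 1},
    {"p2p": 6},
]
estimates = gossip_average(peers)

# Assumption: the number of peers is known, so average * n gives the global count.
n_peers = len(peers)
global_df = {t: round(v * n_peers) for t, v in estimates[0].items()}
print(global_df)
```

Because each exchange preserves the sum, every peer's estimate drifts toward the true average without any peer ever holding the full collection. In a real deployment the network size would itself have to be estimated, which this sketch leaves out.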

Research highlights
► Both hierarchical and gossiping aggregation benefit P2P systems.
► The combination of both approaches is shown to be feasible by means of cost functions.
► Hybrid aggregation performs well across varied setups.
► We assess both term ranking quality and document retrieval results.
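
The hierarchical half of the hybrid scheme can be illustrated in the same spirit. The sketch below sums exact document frequencies for a selected set of terms up a tree of peers. The tree layout, the function name `aggregate_up`, the toy data, and the choice of selected terms are all hypothetical; how the paper builds its overlay or selects terms is not described on this page.

```python
def aggregate_up(children, local_df, node, selected):
    """Sum document frequencies of the selected terms over the subtree rooted at node."""
    totals = {t: local_df[node].get(t, 0) for t in selected}
    for child in children.get(node, []):
        sub = aggregate_up(children, local_df, child, selected)
        for t in selected:
            totals[t] += sub[t]
    return totals

# Hypothetical overlay: root has children a and b; a has child c.
children = {"root": ["a", "b"], "a": ["c"]}
local_df = {
    "root": {"p2p": 1},
    "a": {"p2p": 2, "overlay": 5},
    "b": {"gossip": 3, "overlay": 1},
    "c": {"p2p": 6, "gossip": 1},
}
# Assumption: these terms were chosen as "most important" by some prior step.
selected = ["p2p", "overlay"]
exact = aggregate_up(children, local_df, "root", selected)
print(exact)  # {'p2p': 9, 'overlay': 6}
```

Terms outside `selected` (here `gossip`) never travel up the tree; in a hybrid setup they would be left to an approximate mechanism such as gossip averaging.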

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Systems - Volume 36, Issue 3, May 2011, Pages 579–595