Similarity caching in large-scale image retrieval

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
515634	867057	2012	16 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

Content-based search - جستجوی مبتنی بر محتوا Caching - ذخیره سازی Large scale - مقیاس بزرگ

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر نرم افزارهای علوم کامپیوتر

پیش نمایش صفحه اول مقاله

Similarity caching in large-scale image retrieval

چکیده انگلیسی

Feature-rich data, such as audio-video recordings, digital images, and results of scientific experiments, nowadays constitute the largest fraction of the massive data sets produced daily in the e-society. Content-based similarity search systems working on such data collections are rapidly growing in importance. Unfortunately, similarity search is in general very expensive and hardly scalable.In this paper we study the case of content-based image retrieval (CBIR) systems, and focus on the problem of increasing the throughput of a large-scale CBIR system that indexes a very large collection of digital images. By analyzing the query log of a real CBIR system available on the Web, we characterize the behavior of users who experience a novel search paradigm, where content-based similarity queries and text-based ones can easily be interleaved. We show that locality and self-similarity is present even in the stream of queries submitted to such a CBIR system. According to these results, we propose an effective way to exploit this locality, by means of a similarity caching system, which stores the results of recently/frequently submitted queries and associated results. Unlike traditional caching, the proposed cache can manage not only exact hits, but also approximate ones that are solved by similarity with respect to the result sets of past queries present in the cache. We evaluate extensively the proposed solution by using the real query stream recorded in the log and a collection of 100 millions of digital photographs. The high hit ratios and small average approximation error figures obtained demonstrate the effectiveness of the approach.

Research highlights
► We propose similarity-based caching techniques to increase the throughput of a large-scale CBIR.
► We exploit knowledge from a query log of a real large-scale CBIR system.
► Our experiments are conducted on a very large collection containing 100 million images.

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Information Processing & Management - Volume 48, Issue 5, September 2012, Pages 803–818

نویسندگان

Fabrizio Falchi, Claudio Lucchese, Salvatore Orlando, Raffaele Perego, Fausto Rabitti,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

Similarity caching in large-scale image retrieval

دسترسی سریع

ارتباط

English Website