Free article download: Resemblance and mergence based indexing for high performance data deduplication

Article ID	Journal ID	Publication year	English article	Full-text version
4956476	1444519	2017	22-page PDF	Free download

English title of the ISI article

Resemblance and mergence based indexing for high performance data deduplication

Persian translation of the title

نمایه سازی مشابه و همپوشانی برای داده های با کارایی بالا

Keywords

Fast index, deduplication, resemblance, fingerprint retrieval, key-value index

Related topics

Engineering and Basic Sciences > Computer Engineering > Computer Networks and Communications

Abstract

Data deduplication, a data redundancy elimination technique, is widely employed in many application environments to reduce data storage space. However, it is challenging to provide a fast and scalable key-value fingerprint index, particularly for large datasets, even though index performance is critical to overall deduplication performance. This paper proposes RMD, a resemblance and mergence based deduplication scheme that aims to provide quick responses to fingerprint queries. The key idea of RMD is to leverage a Bloom filter array and a data resemblance algorithm to dramatically reduce the query range. At data ingestion time, RMD uses a resemblance algorithm to detect resembling data segments and places them in the same bin. As a result, at query time, it only needs to search the corresponding bin to detect duplicate content, which significantly speeds up the query process. Moreover, RMD uses a mergence strategy to accumulate resembling segments into relevant bins, and exploits a frequency-based fingerprint retention policy to cap the bin capacity, improving query throughput and the data deduplication ratio. Extensive experimental results with real-world datasets show that RMD achieves high query performance and outperforms several well-known deduplication schemes.
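The abstract does not give implementation details, so the following Python sketch is only an illustration of the general idea it describes: one Bloom filter per bin, routing of resembling segments to the same bin, merging of fingerprints into that bin, and a frequency-based cap on bin size. Every name here (`BloomFilter`, `Bin`, `ResemblanceIndex`, `bin_capacity`, `min_hits`) is hypothetical, and the resemblance test is a stand-in assumption (count how many of a segment's fingerprints hit each bin's Bloom filter), not the algorithm used by RMD.

```python
import hashlib
from collections import Counter


class BloomFilter:
    """A small Bloom filter over byte strings (sizes are illustrative)."""

    def __init__(self, num_bits=8192, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, item):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


class Bin:
    """Fingerprints of resembling segments, with hit counts and a size cap."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.freq = Counter()      # fingerprint -> number of times seen
        self.bloom = BloomFilter()

    def merge(self, fingerprints):
        """Merge a segment into this bin; return the number of duplicates."""
        duplicates = 0
        for fp in fingerprints:
            if fp in self.freq:
                duplicates += 1
            self.freq[fp] += 1
        if len(self.freq) > self.capacity:
            # Assumed retention policy: keep the most frequently seen ones.
            self.freq = Counter(dict(self.freq.most_common(self.capacity)))
        # Bloom filters cannot delete, so rebuild from the retained set.
        self.bloom = BloomFilter()
        for fp in self.freq:
            self.bloom.add(fp)
        return duplicates


class ResemblanceIndex:
    """Routes each segment to at most one bin, so a query searches one bin."""

    def __init__(self, bin_capacity=1024, min_hits=1):
        self.bins = []
        self.bin_capacity = bin_capacity
        self.min_hits = min_hits   # assumed threshold for "resembles"

    def _route(self, fingerprints):
        best, best_hits = None, 0
        for candidate in self.bins:
            hits = sum(fp in candidate.bloom for fp in fingerprints)
            if hits > best_hits:
                best, best_hits = candidate, hits
        return best if best_hits >= self.min_hits else None

    def ingest(self, chunks):
        """Fingerprint a segment's chunks, route it, and count duplicates."""
        fingerprints = [hashlib.sha1(chunk).digest() for chunk in chunks]
        target = self._route(fingerprints)
        if target is None:
            target = Bin(self.bin_capacity)
            self.bins.append(target)
        return target.merge(fingerprints)
```

In this sketch, ingesting a four-chunk segment and then a near-copy that shares three of those chunks sends both segments to a single bin, and the second `ingest` call reports three duplicates. Only that one bin is searched for exact matches, which is the query-range reduction the abstract refers to.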

Publisher

Database: Elsevier - ScienceDirect
Journal: Journal of Systems and Software - Volume 128, June 2017, Pages 11-24

Authors

Panfeng Zhang, Ping Huang, Xubin He, Hua Wang, Ke Zhou
