Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
4956476 | 1444519 | 2017 | 22-page PDF | Free download |
English title of the ISI article
Resemblance and mergence based indexing for high performance data deduplication
Persian translation of the title
نمایه سازی مشابه و همپوشانی برای داده های با کارایی بالا
Keywords
Fast index, deduplication, resemblance, fingerprint retrieval, key-value index
Related topics
Engineering and basic sciences
Computer engineering
Computer networks and communications
English abstract
Data deduplication, a data redundancy elimination technique, has been widely employed in many application environments to reduce data storage space. However, it is challenging to provide a fast and scalable key-value fingerprint index, particularly for large datasets, even though index performance is critical to overall deduplication performance. This paper proposes RMD, a resemblance and mergence based deduplication scheme, which aims to provide quick responses to fingerprint queries. The key idea of RMD is to leverage a Bloom filter array and a data resemblance algorithm to dramatically reduce the query range. At data ingestion time, RMD uses a resemblance algorithm to detect resembling data segments and puts them in the same bin. As a result, at query time, it only needs to search the corresponding bin to detect duplicate content, which significantly speeds up the query process. Moreover, RMD uses a mergence strategy to accumulate resembling segments into relevant bins, and exploits a frequency-based fingerprint retention policy to cap the bin capacity, improving query throughput and the data deduplication ratio. Extensive experimental results with real-world datasets show that RMD achieves high query performance and outperforms several well-known deduplication schemes.
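The bin-based lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the resemblance algorithm, so the sketch assumes the smallest chunk fingerprint of a segment serves as its representative (a min-hash style heuristic). The names `BloomFilter`, `BinIndex`, `ingest`, and `bin_capacity`, as well as the filter sizes and the eviction rule, are hypothetical choices made only for illustration.

```python
import hashlib
from collections import Counter


class BloomFilter:
    """A small Bloom filter over strings (sizes are illustrative assumptions)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive several bit positions by salting the item with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha1(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def fingerprint(chunk: bytes) -> str:
    """Fingerprint of one chunk (SHA-1 is an assumption of this sketch)."""
    return hashlib.sha1(chunk).hexdigest()


class BinIndex:
    """Hypothetical index: one Bloom filter and one capped bin per group."""

    def __init__(self, bin_capacity=4):
        self.bin_capacity = bin_capacity
        self.filters = []  # filters[i] holds representatives routed to bin i
        self.bins = []     # bins[i] maps fingerprint -> number of times seen

    def _find_bin(self, representative):
        # Probe the Bloom filter array instead of searching every bin.
        for bin_id, bloom in enumerate(self.filters):
            if representative in bloom:
                return bin_id
        return None

    def ingest(self, chunks):
        """Index one segment; return the fingerprints found to be duplicates."""
        fps = [fingerprint(c) for c in chunks]
        representative = min(fps)  # assumed resemblance heuristic
        bin_id = self._find_bin(representative)
        if bin_id is None:
            # No resembling segment seen so far: open a new bin.
            self.filters.append(BloomFilter())
            self.bins.append(Counter())
            bin_id = len(self.bins) - 1
            self.filters[bin_id].add(representative)

        counts = self.bins[bin_id]
        duplicates = []
        for fp in fps:
            if fp in counts:        # search only the selected bin
                duplicates.append(fp)
            counts[fp] += 1         # merge the segment into the bin

        # Frequency-based retention: keep only the most frequently seen entries.
        if len(counts) > self.bin_capacity:
            self.bins[bin_id] = Counter(dict(counts.most_common(self.bin_capacity)))
        return duplicates


index = BinIndex(bin_capacity=8)
index.ingest([b"a", b"b", b"c"])
dups = index.ingest([b"c", b"a", b"b"])  # same chunks, so same representative
print(len(dups))  # 3
```

In this sketch a Bloom filter false positive can only route a segment to a non-resembling bin, which lowers the deduplication ratio but never reports a chunk as duplicate unless its fingerprint is actually stored in that bin.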
Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Systems and Software - Volume 128, June 2017, Pages 11-24
Authors
Panfeng Zhang, Ping Huang, Xubin He, Hua Wang, Ke Zhou