Article code: 383082
Journal code: 660801
Publication year: 2014
English article: 9-page PDF
Full text: free download
English title of the ISI article
SBBS: A sliding blocking algorithm with backtracking sub-blocks for duplicate data detection
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract


• Analyzes insertion and deletion operations in the traditional sliding blocking (SB) algorithm.
• Proposes the concept of matching-failed segments caused by these operations.
• Proposes an efficient sliding blocking algorithm with backtracking sub-blocks (SBBS).
• SBBS detects as much duplicate data as possible in matching-failed segments.

With the explosive growth of data, storage systems face heavy storage pressure from large amounts of redundant data caused by duplicate copies or duplicate regions of files. Data deduplication is a storage-optimization technique that reduces the data footprint by eliminating redundant copies and storing only unique data. Data deduplication rests on duplicate data detection techniques, which divide files into parts, compare corresponding parts between files using hashes, and identify redundant data. This paper proposes SBBS, an efficient sliding blocking algorithm with backtracking sub-blocks for duplicate data detection. SBBS improves the duplicate data detection precision of the traditional sliding blocking (SB) algorithm by backtracking the left/right 1/4 and 1/2 sub-blocks in matching-failed segments. Experimental results show that, on average, SBBS improves duplicate detection precision by 6.5% compared with the traditional SB algorithm and by 16.5% compared with the content-defined chunking (CDC) algorithm, and that it adds little extra storage overhead when files are divided into equal chunks of 8 kB.
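
To make the baseline concrete, here is a minimal Python sketch of a traditional sliding blocking pass, the algorithm the abstract says SBBS improves on. It is an illustration under stated assumptions, not the paper's implementation: the function names `block_digests` and `sliding_block_match` are hypothetical, SHA-1 is hashed at every window position for simplicity (practical SB implementations usually pair a cheap rolling checksum with a strong hash), and the sub-block backtracking of SBBS is not implemented because the abstract does not give its details.

```python
import hashlib


def block_digests(data, block_size):
    """Hash every full fixed-size block of the reference data."""
    return {
        hashlib.sha1(data[i:i + block_size]).digest()
        for i in range(0, len(data) - block_size + 1, block_size)
    }


def sliding_block_match(old, new, block_size):
    """Return (duplicates, failed): byte ranges of `new` that match a block
    of `old`, and the ranges in between where no window matched."""
    known = block_digests(old, block_size)
    duplicates, failed = [], []
    pos = 0          # start of the current window in `new`
    fail_start = 0   # start of the current unmatched run
    while pos + block_size <= len(new):
        window = new[pos:pos + block_size]
        if hashlib.sha1(window).digest() in known:
            if fail_start < pos:
                failed.append((fail_start, pos))
            duplicates.append((pos, pos + block_size))
            pos += block_size   # matched: jump a whole block
            fail_start = pos
        else:
            pos += 1            # no match: slide the window by one byte
    if fail_start < len(new):
        failed.append((fail_start, len(new)))
    return duplicates, failed


# Illustrative data (an assumption): one byte inserted inside the second block.
old = b"AAAABBBBCCCCDDDD"
new = b"AAAABBxBBCCCCDDDD"
duplicates, failed = sliding_block_match(old, new, block_size=4)
print(duplicates)
print(failed)
```

In this toy input the single inserted byte leaves an unmatched run covering `BBxBB`, even though four of those five bytes also occur in the old data. Runs of this kind appear to correspond to the matching-failed segments that, per the abstract, SBBS re-examines by backtracking 1/4 and 1/2 sub-blocks.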

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 41, Issue 5, April 2014, Pages 2415–2423